SuperBasic TCP Device - FAO Dilwyn?

NormanDunbar
Forum Moderator
Posts: 2273
Joined: Tue Dec 14, 2010 9:04 am
Location: Leeds, West Yorkshire, UK

Post by NormanDunbar »

I'm sure I read a query, I think from Dilwyn, about accessing his Web Page using the TCP device from SuperBASIC. If I did, I'm completely unable to find it!

Anyway, if there was such a query, the following works, but ....

1000 CLS
1010 OPEN_IN #3,'tcp_www.dilwyn.me.uk:80'
1020 PRINT #3, 'GET /index.html HTTP/1.1' & CHR$(13) & CHR$(10);
1030 PRINT #3, 'HOST:dilwyn.me.uk' & CHR$(13) & CHR$(10) & CHR$(13) & CHR$(10);
1040 REPeat loop
1050   IF EOF(#3) THEN EXIT loop: END IF
1060   INPUT #3, html$
1070   FOR x = 1 TO LEN(html$)
1080     IF html$(x) = CHR$(13) THEN EXIT x: END IF
1090     PRINT html$(x);
1100   END FOR x
1110   PRINT
1120 END REPeat loop
1130 CLOSE #3
The result is as follows:

HTTP/1.1 301 Moved Permanently
Date: Mon, 27 Feb 2017 20:22:17 GMT
Server: Apache/2.2.22 (Debian)
Location: http://www.dilwyn.me.uk/index.html
Vary: Accept-Encoding
Content-Length: 320
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://www.dilwyn.me.uk/index.html">here</a>.</p>
<hr>
<address>Apache/2.2.22 (Debian) Server at dilwyn.me.uk Port 80</address>
</body></html>
Now the "buts":

1. It never hits EOF, so just sits there, looping the loop and printing nothing. CTRL+SPACE required. I suppose I could use the "Content-Length:" header to get the size of the data and work from that.

2. OPEN_IN? WTF? I tried everything OPEN, OPEN_NEW, OPEN_OVER and either got EOF, or errors. I accidentally typed OPEN_IN and it bloody worked. But I'm PRINTing to an input channel. Confused? Yup, me too!

3. Lines need Windows-style line ends: CHR$(13) & CHR$(10).

4. You need to send the HOST header, because most/all web sites are on shared servers, and then a blank line to end the request.

5. I suspect the output won't be helpful. The error return from HTTP, in the first string returned, is always "HTTP/1.1 301 Moved Permanently" - Looks like there's more to the underlying use of HTTP than meets the eye!

6. There's always a blank line between the last header record and the actual content of the page.

7. The web site appears to have moved, but the location given for the new destination looks exactly the same as the one I just opened. I feel the need to RTFM (or at least the RFC) coming on soon!
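
Points 3 and 4 above are easier to see with the request written out as one string. This is only a rough sketch in Python, not SuperBASIC: `build_get_request` is a made-up helper name, and the `Connection: close` line is my own assumption (it asks the server to close the connection when it has finished, which may or may not help with the missing EOF on the QL side; I haven't tested that).

```python
def build_get_request(host: str, path: str = "/index.html") -> bytes:
    """Build a minimal HTTP/1.1 GET request (hypothetical helper, for illustration)."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line
        f"Host: {host}",         # tells a shared server which site is wanted
        "Connection: close",     # assumption: ask the server to close when done
    ]
    # Every line ends in CR LF; one extra CR LF (a blank line) ends the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    print(build_get_request("www.dilwyn.me.uk"))
```

The bytes it produces are what the two PRINT #3 statements in the listing send, plus the one extra header.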

HTH

Cheers,
Norm.


Why do they put lightning conductors on churches?
Author of Arduino Software Internals
Author of Arduino Interrupts

No longer on Twitter, find me on https://mastodon.scot/@NormanDunbar.
tofro
Font of All Knowledge
Posts: 2700
Joined: Sun Feb 13, 2011 10:53 pm
Location: SW Germany

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by tofro »

Norman,

some hints:
  1. The "Open key" for the open command of a socket is extremely important:
    D0=0 (old/exclusive, OPEN) just creates a socket without connecting it to anywhere. Not of much use from S*BASIC.
    D0=1 (OPEN_IN) opens a TCP socket and connects it (what we want when talking to a server).
    D0=2 (OPEN_NEW) binds a socket to an address/port. Used for server sockets.

    Whether it was a good idea to use the OPEN key for this purpose (selecting between at least 3 basic operation modes of the original C interface, thus implementing socket, bind and accept from the original) is disputable (it's, at least, annoying).
  2. The response you get tells you that you didn't specify the URL properly. The server is much happier with this line:

    1020 PRINT #3, 'GET http://dilwyn.me.uk/index.html HTTP/1.1' & CHR$(13) & CHR$(10);

    and will actually respond with a page.
  3. A TCP socket will most probably never reach EOF, so INPUT is not very suitable for reading from a socket. You should use GET or BGET in a loop instead. Unfortunately, S*BASIC has no command that maps directly to the IO.FSTRG trap, which is what should be used here (just give me what's there and put it in a buffer, regardless of how it's terminated), so this must be emulated with a loop around GET, BGET, or INKEY$.
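
To illustrate hint 3 away from the QL, here is a rough Python sketch of a byte-at-a-time loop that stops on a marker rather than waiting for EOF. `read_until_blank_line` is a hypothetical name, and any file-like byte stream stands in for the TCP channel; treat it as a picture of the idea, not of how the IP driver works internally.

```python
import io
from typing import BinaryIO


def read_until_blank_line(stream: BinaryIO) -> bytes:
    """Read one byte at a time until the CR LF CR LF that ends the headers.

    Comparable to a BGET or INKEY$ loop: it stops on a marker, not on EOF.
    """
    buf = bytearray()
    while not buf.endswith(b"\r\n\r\n"):
        byte = stream.read(1)
        if not byte:  # the stream really did end; return what we have
            break
        buf += byte
    return bytes(buf)


if __name__ == "__main__":
    demo = io.BytesIO(b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello")
    print(read_until_blank_line(demo))  # header block only
    print(demo.read())                  # the body is still waiting in the stream
```

On a real connection the same function could be fed the object returned by `socket.makefile("rb")`.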
Cheers,
Tobias


ʎɐqǝ ɯoɹɟ ǝq oʇ ƃuᴉoƃ ʇou sᴉ pɹɐoqʎǝʞ ʇxǝu ʎɯ 'ɹɐǝp ɥO
NormanDunbar
Forum Moderator
Posts: 2273
Joined: Tue Dec 14, 2010 9:04 am
Location: Leeds, West Yorkshire, UK

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by NormanDunbar »

Thanks tofro, I'll give that a try next time I'm at the QPC.

Cheers,
Norm.


dilwyn
Mr QL
Posts: 2761
Joined: Wed Dec 01, 2010 10:39 pm

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by dilwyn »

The reason for this in the first place is just to document use of TCP from BASIC and some simple examples, as there is no documentation/examples whatsoever it seems for the BASIC user. Hardly surprising nobody uses it, what Jonathan Hudson used to call "the QDOS official secrets act". Yes the information is "sort of there" in a document aimed at C/Linux users, well nigh impenetrable for anyone like me literate in neither.

Information is never voluntarily provided, so on the QL scene (as with others I guess, human nature) the way to get an answer is to deliberately put out something which gives people the excuse to pop up and correct you.

So this is where my example program goes now following the feedback:

100 OPEN_IN #3,"tcp_www.dilwyn.me.uk:80"
110 PRINT #3,"GET http://www.dilwyn.me.uk/index.html HTTP/1.0"&CHR$(13)
120 PRINT #3,"HOST:www.dilwyn.me.uk"&CHR$(13)
130 PRINT #3,CHR$(13) : REMark blank line to end section
140 OPEN_NEW #4,ram1_test_txt : REMark file to accept what is fetched
150 :
160 REMark read back header section
170 contentLength = 0 : REMark length of body to fetch
180 REPeat loop
190   IF EOF(#3) : EXIT loop
200   INPUT #3,t$
210   IF t$ = CHR$(13) THEN EXIT loop : REMark blank line ends header
220   IF t$(LEN(t$)) = CHR$(13) THEN t$ = t$(1 TO LEN(t$)-1)
230   PRINT t$ : PRINT #4,t$
240   IF ("Content-Length:" INSTR t$) = 1 THEN
250     contentLength = t$(16 TO LEN(t$)):REM might include a leading space
260   END IF
270 END REPeat loop
280 PRINT : PRINT #4, : REMark blank line between header and body
290 :
300 REMark read the body part (the actual HTML page)
310 FOR a = 1 TO contentLength
320   k$ = INKEY$(#3)
330   IF k$ <> CHR$(13) : PRINT k$; : PRINT #4,k$;
340 END FOR a
345 :
350 CLOSE #4
360 CLOSE #3
It seems to work, but I'm fully open to further information/corrections/enhancements.

One or two notes, from trial and error:

1. Don't put 'http:' in front of the www. in line 100, it'll stop it working.
2. The bit which ain't clear from examples out there is the need for the HOST command - amazing how many examples out there don't include it and we're stuck without it in most cases. The reason for its inclusion has been mentioned (private/virtual servers)
3. Use of the correct OPEN key is vital, who would have guessed that to write you have to OPEN_IN. Yes, it's "sort of" documented but not really for the BASIC programmer. Having been explained, I understand why it was done in this way, but would probably have never "guessed" it.

The above program collects everything (header and body) which is gathered from the tcp channel into a file called test_txt and copies output to the screen as well so you can see what's going on. If you only want the HTML file, for example, remove the PRINT #4 from lines before 300, that way only the "body" of the reply is collected into the file.

What's returned is in two sections, the header and the body, separated by a blank line. When you send commands to the server, end the commands sent in PRINT statements with a blank line. Ideally use and expect the CR LF combination at the end of each line.

Since there seems to be no EOF, I look at the lines returned, find the "Content-Length:" statement returned by the server, and take the part after the ':' separator as the length of the "content" (the length of the body part, i.e. the HTML itself). I then use INKEY$ or BGET in a loop of that size to read back the body part.
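
For anyone who wants to compare against another language, the Content-Length lookup described above might look like this in Python. It is only a sketch: `content_length` is a hypothetical helper, and matching the header name without regard to case is an extra on my part, since HTTP header names are case-insensitive.

```python
def content_length(header_block: str) -> int:
    """Return the Content-Length value from a block of header lines, or 0 if absent."""
    for line in header_block.split("\r\n"):
        name, sep, value = line.partition(":")  # split at the first ':' only
        if sep and name.strip().lower() == "content-length":
            return int(value.strip())           # strip() drops the leading space
    return 0


if __name__ == "__main__":
    sample = (
        "HTTP/1.1 301 Moved Permanently\r\n"
        "Date: Mon, 27 Feb 2017 20:22:17 GMT\r\n"
        "Content-Length: 320\r\n"
        "\r\n"
    )
    print(content_length(sample))
```

The status line has no ':' in it, so it is skipped, and the Date line is split at its first ':' only, so the time of day causes no trouble.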

HTTP documentation is needed for the full list of statements which may be returned such as Date: Pragma: Location: Server: www-Authenticate: Content-type: Content-length: Content-encoding: Expires: Last-Modified: Allow: If-Modified-Since: and so on.

I'll press on to produce a simple document describing how to use it from BASIC with a few simple examples like the above, in case anyone wants to try to use it for some purpose. I'll mention it on here when it's done, then those who know more about this than me can suggest improvements until we have a useful document.


Derek_Stewart
Font of All Knowledge
Posts: 3957
Joined: Mon Dec 20, 2010 11:40 am
Location: Sunny Runcorn, Cheshire, UK

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by Derek_Stewart »

Hi Dilwyn,

How hard would it be to create a Web Browser from this?

Would compiled S*Basic be fast enough, I wonder?


Regards,

Derek
Martin_Head
Aurora
Posts: 852
Joined: Tue Dec 17, 2013 1:17 pm

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by Martin_Head »

dilwyn wrote:The reason for this in the first place is just to document use of TCP from BASIC and some simple examples, as there is no documentation/examples whatsoever it seems for the BASIC user. Hardly surprising nobody uses it, what Jonathan Hudson used to call "the QDOS official secrets act". Yes the information is "sort of there" in a document aimed at C/Linux users, well nigh impenetrable for anyone like me literate in neither.

Information is never voluntarily provided, so on the QL scene (as with others I guess, human nature) the way to get an answer is to deliberately put out something which gives people the excuse to pop up and correct you.
This thread http://qlforum.co.uk/viewtopic.php?t=14 ... 386#p12142 has documentation about using the IP drivers from Assembler. And this page http://www.dilwyn.me.uk/internet/index.html has IPBasic, a SuperBASIC interface for the IP drivers, including documentation.

I also did an article in QUANTA about using the IP drivers from Basic (June-Sept 2016). And I sent in an article about writing a simple Web server in Basic some time ago, but it has not been printed yet.


dilwyn
Mr QL
Posts: 2761
Joined: Wed Dec 01, 2010 10:39 pm

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by dilwyn »

Martin_Head wrote:
This thread http://qlforum.co.uk/viewtopic.php?t=14 ... 386#p12142 has documentation about using the IP drivers from Assembler. And this page http://www.dilwyn.me.uk/internet/index.html has IPBasic, a SuperBASIC interface for the IP drivers, including documentation.

I also did an article in QUANTA about using the IP drivers from Basic (June-Sept 2016). And I sent in an article about writing a simple Web server in Basic some time ago, but it has not been printed yet.
Thanks Martin, I'll document these when I have finished the page about use of TCP from BASIC.

The "mist" is starting to disperse...


dilwyn
Mr QL
Posts: 2761
Joined: Wed Dec 01, 2010 10:39 pm

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by dilwyn »

Derek_Stewart wrote:Hi Dilwyn,

How hard would it be to create a Web Browser from this?

Would compiled S*Basic be fast enough, I wonder?
Give me a few minutes... :)

Seriously, if you are talking about a simple text browser handling only very basic HTML (no cgi-bin, PHP, JavaScript etc.), it's probably not too difficult technically, although it would be a lot of time-consuming work to write. The code would need to scan the downloaded page and extract lists of clickable links and image links.
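
Just to show the scale of the "scan and extract links" part, here is a rough Python sketch using the standard library's HTML parser. `LinkCollector` is a made-up class name, and a QL version would of course have to do its own scanning in S*BASIC or assembler; this only shows what needs collecting.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href targets of <a> tags and src targets of <img> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links = []   # clickable links, in page order
        self.images = []  # image links, in page order

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # tag and attribute names arrive in lower case
        if tag == "a" and attr.get("href"):
            self.links.append(attr["href"])
        elif tag == "img" and attr.get("src"):
            self.images.append(attr["src"])


if __name__ == "__main__":
    page = '<p>The document has moved <a href="http://www.dilwyn.me.uk/index.html">here</a>.</p>'
    collector = LinkCollector()
    collector.feed(page)
    print(collector.links)
```

Relative links would still need resolving against the page address before they could be fetched.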

The speed at which this demo program downloaded the page on QPC2 was reasonable for a fairly short page, given it was downloading a byte at a time. Needs io.fstrg or iof.load support I suppose, I'm sure Marcel mentioned this recently somewhere. Secure connections probably couldn't be easily done. Would need some support for posting data via message forms and the like which is probably possible but I don't know yet.

EDIT: what Marcel mentioned was implementing iof.load/iof.save in the IP driver; see the list of QPC2 updates on his site.

Looking at http documents quickly, it might need to be able to handle various encodings etc if pages are sent say base64 or UUencoded, I need to look into this type of thing more to see what's used where and so on. i.e. are html pages ever sent encoded or is that only in emails?

I'll have to have a look through Martin's IP documents to see if there's relevant information there too.

All this info will eventually appear in the Documents section of my site, along with the original RZ uQLx docs now that XorA has delighted so many with his work on uQLx.

A while back RWAP mentioned the desirability of there being some form of "killer app" for the QL in 2017. Not saying this would be what he was thinking of, but a simple browser and an email program would I'm sure be very welcome (the browser could probably double up as an email viewer to save having to write a separate html viewer).

Now where's my thinking cap? (not had much use lately so I might have difficulty finding it) ;-)


dilwyn
Mr QL
Posts: 2761
Joined: Wed Dec 01, 2010 10:39 pm

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by dilwyn »

A couple of small updates to my example program above.

1. I can't get the listing to run on QemuLator. It stops at the first PRINT #3 statement with no error message whatsoever (QDOS) or just "At line 130:1" with no message as such (SMSQ/E for QemuLator and SMSQ/E Gold Card.) I had told the firewall to allow QemuLator access.

2. I hadn't realised it'd be possible to fetch a string of bytes from the TCP channel - vanilla BASIC doesn't have such a function to return a given number of bytes from a channel, but some toolkits like Turbo Toolkit (INPUT$) and DJtoolkit (FETCH_BYTES) do. Obviously, can only fetch up to 32K at a time. This won't work on pre-JS ROMs as the buffer is a fixed 128 bytes max. Only going by the listing below, using INPUT$ or FETCH_BYTES seems subjectively much faster, though no timings done.

With this in mind, I slightly modified the listing to use INPUT$ if fetching less than 32K at a time, or INKEY$ in a loop to fetch a byte at a time otherwise. It'd probably be possible to fetch chunks using repeated INPUT$ if you had the patience to work out how many fetches were needed. Done in this way, all carriage returns are saved to file, whereas the byte-at-a-time version filters the CRs out as it goes along.

I know it doesn't work on QemuLator (don't know why) - is anyone able to test it on a uQLx system? Or on any other system with TCP/IP (SMSQmulator?)

100 CLS : CLS #0
110 PRINT #0,'Fetching page...'
120 OPEN_IN #3,"tcp_www.dilwyn.me.uk:80"
130 PRINT #3,"GET http://www.dilwyn.me.uk/downloads.html HTTP/1.0"&CHR$(13)
140 PRINT #3,"HOST:www.dilwyn.me.uk"&CHR$(13)
150 PRINT #3,CHR$(13) : REMark blank line to end section
160 OPEN_NEW #4,ram1_test_txt : REMark file to accept what is fetched
170 :
180 REMark read back header section
190 contentLength = 0 : REMark length of body to fetch
200 REPeat loop
210   IF EOF(#3) : EXIT loop
220   INPUT #3,t$
230   IF t$ = CHR$(13) THEN EXIT loop : REMark blank line ends header
240   IF t$(LEN(t$)) = CHR$(13) THEN t$ = t$(1 TO LEN(t$)-1)
250   PRINT t$ : PRINT #4,t$
260   IF ("Content-Length:" INSTR t$) = 1 THEN
270     contentLength = t$(16 TO LEN(t$))
280   END IF
290 END REPeat loop
300 PRINT : PRINT #4, : REMark blank line between header and body
310 :
320 REMark read the body part (the actual HTML page)
330 IF contentLength > 32766 THEN
340   PRINT #0,'Using INKEY$...(';contentLength;' bytes)'
350   FOR a = 1 TO contentLength
360     k$ = INKEY$(#3)
370     IF k$ <> CHR$(13) : PRINT k$; : PRINT #4,k$;
380   END FOR a
390 ELSE
395   REM uses INPUT$ from Turbo Toolkit
400   PRINT #0,'Using INPUT$...(';contentLength;' bytes)'
410   k$ = INPUT$(#3,contentLength) : PRINT k$; : PRINT #4,k$;
420 END IF
430 :
440 CLOSE #4
450 CLOSE #3
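
The "work out how many fetches" step for repeated INPUT$ calls is really just a division with remainder. A small Python sketch of the arithmetic follows; `chunk_sizes` is a hypothetical helper, and the 32766 limit is simply the threshold used in the listing above.

```python
from typing import List


def chunk_sizes(total: int, max_chunk: int = 32766) -> List[int]:
    """Split a body of `total` bytes into fetch sizes of at most `max_chunk`."""
    full, rest = divmod(total, max_chunk)  # whole chunks, plus what is left over
    return [max_chunk] * full + ([rest] if rest else [])


if __name__ == "__main__":
    print(chunk_sizes(320))    # a short page needs a single fetch
    print(chunk_sizes(70000))  # a longer page needs several
```

In the S*BASIC version each entry in the list would become one INPUT$(#3,size) call.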


tofro
Font of All Knowledge
Posts: 2700
Joined: Sun Feb 13, 2011 10:53 pm
Location: SW Germany

Re: SuperBasic TCP Device - FAO Dilwyn?

Post by tofro »

Dilwyn,

following Martin's experiments, I am convinced the TCP/IP implementation in Q-Emulator is somewhat limited (he mentions something like that in the thread referenced above).

QPC, uQLx and SMSQmulator should work more or less identically, as their implementations are based on the same specification (that of uQLx).

Tobias

