Hi!
I need to check whether a given URL is valid.
Let's say I have the URL
http://www.microsoft.com
First I strip it down to the domain:
microsoft.com
Then I connect to the domain on port 80 and send the following:
GET / HTTP/1.0
Accept: */*
And get back:
HTTP/1.1 200 OK
Server: Microsoft-IIS/5.0
P3P: CP='ALL IND DSP COR ADM CONo CUR CUSo IVAo IVDo PSA PSD TAI TELo OUR SAMo CNT COM INT NAV ONL PHY PRE PUR UNI'
Content-Location: http://tkmsftwbw09/default.htm
Date: Tue, 11 Jun 2002 21:03:12 GMT
Content-Type: text/html
Accept-Ranges: bytes
Last-Modified: Tue, 11 Jun 2002 16:07:16 GMT
ETag: "72819ee6211c21:8c7"
Content-Length: 27436
<HTML>
<HEAD>
[...] rest of data part
The interesting part for me is the
HTTP/1.1 200 OK
which indicates that the page was found. Fine.
But if I try it with my domain
pl4.net
I get the following:
HTTP/1.1 404 Not Found
Date: Tue, 11 Jun 2002 21:20:30 GMT
Server: Apache/1.3.22 (Unix)
Connection: close
Content-Type: text/html; charset=iso-8859-1
Does anyone know where the problem could be?
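For reference, the two text-handling steps of the check described above (getting the host out of the URL and reading the status line of the reply) might look like this in Python. This is only a sketch: the function names `host_from_url` and `parse_status_line` are made up for illustration, and unlike the post it keeps the `www.` prefix rather than stripping it.

```python
from urllib.parse import urlsplit


def host_from_url(url):
    """Return the host part of a URL, e.g. 'www.microsoft.com'."""
    host = urlsplit(url).hostname
    if host is None:
        # Happens for input without a scheme, such as a bare 'pl4.net'.
        raise ValueError(f"no host in URL: {url!r}")
    return host


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError(f"not an HTTP status line: {line!r}")
    version = parts[0]
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason
```

With the two replies quoted in this post, `parse_status_line` would give the code 200 for the first and 404 for the second.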
Do you usually access your domain as www.xyz.com/index.asp?
Maybe your default document isn't set. Just an idea, not really sure.
Some (most) servers need the 'Host' header in the request. The server uses it to identify which website is being requested when several domain names share one IP.
GET / HTTP/1.1
Host: www.pl4.net
Accept: */*
Thomas
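A request like the one above could be built and sent from Python roughly as follows. This is a sketch under a few assumptions: the helper names `build_request` and `fetch_status_line` are invented here, and a `Connection: close` header is added (it is not in the request above) so that an HTTP/1.1 server closes the socket after answering instead of keeping it open.

```python
import socket


def build_request(host, path="/", version="1.1"):
    """Build a GET request that names the wanted site in the Host header."""
    lines = [
        f"GET {path} HTTP/{version}",
        f"Host: {host}",
        "Accept: */*",
        "Connection: close",  # assumption: added so the server closes the socket
    ]
    # Header lines end with CRLF; an empty line terminates the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_status_line(host, port=80, timeout=5.0):
    """Connect, send the request, and return the first line of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host))
        data = b""
        while b"\r\n" not in data:
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("iso-8859-1")
```

Calling `fetch_status_line("www.pl4.net")` needs network access; only the status line is read, the rest of the reply is discarded when the socket closes.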
Funny, on my server I always use the full host header, e.g. www.etcetc.etc
I just tested and it doesn't seem to mind if you just use etcetc.etc
Interesting... so maybe some servers don't care, so maybe I could shave 4 bytes off all my code that has a web address in it :-)
<thinks to self: really need to sleep>
If your server has a fixed, unique IP that serves a single site, the hostname isn't needed to find the right pages. However, when multiple domain names share one IP, you need to tell the server which domain the page is requested for.
So it all depends on the server, but to be on the safe side just always add the Host header (or supply the full URL in the GET line). HTTP/1.1 requires the Host header in every request anyway.
Thomas
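The two options mentioned above differ only in the request line. A small sketch, with a made-up helper name `request_line`, shows both forms. RFC 2616 says HTTP/1.1 servers must accept the full-URL form, but older servers may not, which is one more reason the Host header is the safer default.

```python
def request_line(host, path="/", absolute=False, version="1.1"):
    """Return the GET line, either with a bare path or with the full URL."""
    target = f"http://{host}{path}" if absolute else path
    return f"GET {target} HTTP/{version}"


# Bare path: the server only learns the site name from the Host header.
print(request_line("pl4.net"))
# Full URL: the site name travels in the request line itself.
print(request_line("pl4.net", absolute=True))
```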
I have 4 domains on that one IP address on the server. In the config I use the fully qualified host headers, but it doesn't seem to care if I just use the shorter one.
<off topic on>
By the way, Thomas: excellent neural network stuff on the char recognition.
I've been playing with some time series stuff. Neural nets and GAs have been a sort of favorite of mine for some years, and yours is the first implementation I've ever seen in asm. Excellent work!
<off topic off>
Thanks Thomas! That did the trick!
GET / HTTP/1.0
Host: pl4.net
Accept: */*
HTTP/1.1 200 OK
Date: Tue, 11 Jun 2002 22:15:10 GMT
Server: Apache/1.3.22 (Unix)
Last-Modified: Mon, 24 Dec 2001 11:36:21 GMT
ETag: "15119ec-246-3c271335"
Accept-Ranges: bytes
Content-Length: 582
Connection: close
Content-Type: text/html
Haha, I always use wininet.dll for this kind of stuff :eek: Let Microsoft handle the HTTP stuff, just give me the data :)
I need to check thousands of URLs in the shortest possible time :rolleyes:
That's why I use the HTTP protocol directly :)
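Since most of the time per URL is spent waiting on the network, one way to get through thousands of them is to run many checks at once. The sketch below uses a thread pool from Python's standard library; `check_many` is a hypothetical name, and `check_one` stands for whatever single-URL check is used (for example a raw socket request that returns the status code). Sending HEAD instead of GET would also avoid downloading the page body, assuming the servers handle HEAD properly.

```python
from concurrent.futures import ThreadPoolExecutor


def check_many(urls, check_one, workers=32):
    """Run check_one(url) for every URL in parallel; return {url: result}."""
    urls = list(urls)

    def safe(url):
        try:
            return check_one(url)
        except Exception as exc:
            # One unreachable host must not abort the whole batch,
            # so the exception is stored as that URL's result.
            return exc

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as the input URLs.
        return dict(zip(urls, pool.map(safe, urls)))
```

The worker count of 32 is an arbitrary starting value; the best number depends on the machine and the network.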
Not that you need all this information, and you probably already have it, but I am writing my own little HTTP server and reading RFC 2616. Here it is, if you don't already have it :)
http://www.w3.org/Protocols/rfc2616/rfc2616.html