|
||||
robots.txt is a text file that tells web crawlers which parts of your hosting folder they may visit and which parts they should stay out of.
It helps you keep pages or files out of search engine listings when you don't want them listed. That's the brief info I know about it... I'm sure there is more to it, but I don't know the rest :P
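To make that concrete, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. The robots.txt content, the bot name `AnyBot`, and the `www.example.com` paths are all made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/secret.html
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A well-behaved crawler would run a check like this before each request.
for path in ("/index.html", "/private/notes.html", "/drafts/secret.html"):
    url = "http://www.example.com" + path
    print(path, is_allowed(ROBOTS_TXT, "AnyBot", url))
```

Each `Disallow` line is a path prefix, so `/private/` covers everything underneath that folder.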
__________________
Boonage - Freedom is Everything |
|
|||
Yeah, that's correct.
I'd like to add something: when a spider lands on your site, robots.txt is the first file it looks for.
__________________
SEO Malaysia |
|
|||
While agreeing with all the previous posts, I want to add that robots.txt is a single text file that you put in the root directory of your website. So even with several search engine agents, crawlers, spiders, ants or whatever they are called, you can't use more than one robots.txt.
Besides, robots.txt is one of the two ways of telling search engines which pages to index and follow and which ones not to. The other way is a "meta robots" tag in the head section of each web page's source code. Happy searching! |
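As a rough illustration of both points, the Python sketch below works out where a site's one robots.txt lives for any page URL, and reads the directives from a "meta robots" tag. The function and class names (`robots_txt_url`, `MetaRobotsParser`, `meta_robots_directives`) and the sample URL and HTML are hypothetical, chosen only for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Build the robots.txt URL for the site that page_url belongs to."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def meta_robots_directives(html):
    """Return the set of meta robots directives found in an HTML string."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(robots_txt_url("http://www.example.com/blog/post.html?id=3"))
print(sorted(meta_robots_directives(page)))
```

The difference in practice: robots.txt is one file for the whole site, while the meta tag has to be placed on every page it should apply to.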
|
||||
Quote:
The Web Robots Pages
The robots exclusion standard is voluntary, so bad bots may ignore your robots.txt. Many honeypots use this behaviour as a trap. NST have got one: http://www.nst.com.my/robots.txt
Anybody can read your robots.txt too, so even if robots aren't crawling your 'atmpinnumbersreminder.html', people can see that you've blocked it.
WMM had a very odd robots.txt, full of all sorts of cruft; maybe they rewrote the URL to something else for a non-robot user-agent. They seem to have changed their policy since I started previewing a reply to this thread last night. I've given up trying to get the same enormous text response as last time, just in case you're watching your log, admin!
Google must be crawled by more robots, and theirs doesn't contain any cruft. They don't discriminate between robots, but they do stop them crawling a lot of stuff: http://www.google.com/robots.txt
Any robot can look at anything at exabytes.com.my: http://www.exabytes.com.my/robots.txt
The government says it doesn't have one: http://www.gov.my/robots.txt
The CIA says it's secret, so you have to use https, but then it doesn't give it to you when you try. Maybe you can crawl their site, but then they have to kill your robot.
Finally! This is what I was looking for, a robots.txt that has different rules for different bots: http://en.wikipedia.org/robots.txt
I hope that's useful.
Oh yeah, one last thing: when robots come back time and time again for the same dynamically generated content, only with different URL arguments, your visitors get funny results from their searches, and the robot fetches many times more pages from your site than it needs to. Some robots can apply wildcard pattern rules (a limited, regular-expression-like syntax) so that they won't follow form targets, for example: http://lolyco.com/robots.txt
I hope that's useful! |
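Here is a small sketch of "different rules for different bots", again with Python's standard `urllib.robotparser`. The bot names (`HypotheticalImageBot`, `HypotheticalSearchBot`, `SomeOtherBot`), the paths, and the `allowed` helper are invented for illustration and are not copied from any of the sites linked above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with one group per bot plus a catch-all group.
ROBOTS_TXT = """\
User-agent: HypotheticalImageBot
Disallow: /

User-agent: HypotheticalSearchBot
Disallow: /search
Disallow: /cgi-bin/

User-agent: *
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Ask whether the named bot may fetch the path on the example host."""
    return parser.can_fetch(agent, "http://www.example.com" + path)


for agent in ("HypotheticalImageBot", "HypotheticalSearchBot", "SomeOtherBot"):
    for path in ("/index.html", "/search/robots", "/cgi-bin/form.pl"):
        print(agent, path, allowed(agent, path))
```

Since the standard is voluntary, a check like this only describes what a polite crawler would do; nothing in the file stops a bot that never asks. Also note that this standard-library parser matches plain path prefixes, so it is not a guide to how wildcard rules behave.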
|
|||
Thanks for all the information, although I don't see why you quoted one of my sentences! You confirmed the same thing: there must be one file containing all such information. I think we agree on that.
Besides, Google recently explained this issue in Google Webmaster Central. I think it at least clarifies Google's point of view. |
|
|||
How do you create one?
Is there software that generates it and lets you edit the entries?
Abdul Rahman
Microsoft Excel 2007 and Microsoft Word 2007 Flash Tutorials |
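No special software is needed, since robots.txt is plain text and any text editor will do. If you'd rather generate it, here is a minimal sketch in Python; the `build_robots_txt` helper, the bot name `HypotheticalBot`, and the paths are assumptions made up for this example.

```python
def build_robots_txt(rules):
    """Render {user_agent: [disallowed_path, ...]} as robots.txt text."""
    groups = []
    for agent, paths in rules.items():
        lines = [f"User-agent: {agent}"]
        # An empty Disallow value means nothing is blocked for this agent.
        lines += [f"Disallow: {p}" for p in paths] or ["Disallow:"]
        groups.append("\n".join(lines))
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(groups) + "\n"


text = build_robots_txt({
    "*": ["/cgi-bin/", "/tmp/"],
    "HypotheticalBot": [],
})
print(text)
```

Save the output as a file named robots.txt and upload it to the root directory of the site, so that it is served at the top level of your domain.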
|
|||
Don't look at it like we have a problem here. Take it easy, friend!
|
All times are GMT +8. The time now is 07:40 PM.
Powered by vBulletin® Version 3.6.8
Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
LinkBacks Enabled by vBSEO 3.1.0 vBulletin skin by ForumMonkeys.com.