Webmaster Malaysia Forum » Website Marketing and Promotion » Search Engine Marketing

  #1 (permalink)  
Old 06-04-2008, 04:16 PM
New kid on the block
 
Join Date: Apr 2008
Location: sydney
Posts: 6
Rep Power: 0
tedsewely is on a distinguished road
What is Robots.txt

What is robots.txt?
How does it help?

Thanks
Ted,
  #2 (permalink)  
Old 06-04-2008, 10:37 PM
yipguseng's Avatar
Lost Webmaster
 
Join Date: Jan 2007
Location: Petaling Jaya
Posts: 831
Rep Power: 36
yipguseng will become famous soon enough
robots.txt is a text file that tells web crawlers which parts of your hosting folder they may and may not visit.

It helps keep pages or files that you don't want listed out of search engines, provided the crawler honours the file.

That's the brief info I know about it... I'm sure there's more to it, but I don't know :P
__________________
Boonage - Freedom is Everything
  #3 (permalink)  
Old 14-04-2008, 03:26 AM
limcs's Avatar
Administrator
 
Join Date: Jul 2006
Location: Penang
Posts: 1,720
Rep Power: 10
limcs has a spectacular aura about
Pretty much that's it.

It can be used to block certain search engine bots, as well as block their access to certain pages.
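
As a rough illustration of both uses (not taken from any real site), the sample file below shuts out one bot entirely and keeps everyone else away from a single folder. The bot name BadBot, the /private/ path and the example.com addresses are made up, and the helper name build_parser is just for this sketch; the check itself uses Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the bot name and paths are invented for illustration.
SAMPLE = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""

def build_parser(text):
    """Parse robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

rules = build_parser(SAMPLE)

# BadBot is blocked everywhere; other bots are only kept out of /private/.
badbot_home = rules.can_fetch("BadBot", "http://example.com/index.html")
other_home = rules.can_fetch("SomeOtherBot", "http://example.com/index.html")
other_private = rules.can_fetch("SomeOtherBot", "http://example.com/private/notes.html")
print(badbot_home, other_home, other_private)  # prints: False True False
```

Keep in mind this only shows what a well-behaved crawler would decide; nothing in the file can force a bot to obey.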
__________________
Read the Rules
  #4 (permalink)  
Old 15-04-2008, 07:01 AM
Novice Webmaster
 
Join Date: Jun 2007
Location: Sabah
Posts: 17
Rep Power: 0
Gordon is on a distinguished road
Yeah, that's correct.

I'd like to add one more thing:

When a spider lands on your site, robots.txt is the first file it looks for.
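
Crawlers look for the file at a fixed address at the top level of the site, so they can work out where it is from any page URL. A small sketch of that step in Python (standard library only); the helper name robots_url and the example.com addresses are placeholders, not anything a particular crawler uses.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://www.example.com/shop/items.php?id=7"))
# prints: http://www.example.com/robots.txt
print(robots_url("http://blog.example.com/2008/06/post.html"))
# prints: http://blog.example.com/robots.txt
```

Note that the two hosts end up with two different addresses: each hostname has its own robots.txt.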
__________________
SEO Malaysia
  #5 (permalink)  
Old 10-06-2008, 01:17 AM
Novice Webmaster
 
Join Date: Jan 2008
Location: Kuala Lumpur
Posts: 52
Rep Power: 7
Site Booster is on a distinguished road
Quote:
Originally Posted by tedsewely View Post
What is robots.txt
How does it help

Thanks
Ted,
While agreeing with all previous posts, I wanna add that robots.txt is a single text file that you add to the root directory of your website. So, even for several search engine agents, crawlers, spiders, ants or whatever they are called, we aren't allowed to use more than one robots.txt.

Besides, robots.txt is one of the two ways of telling search engines which pages to index and follow and which ones not to. The other way is a "robots" meta tag in the <head> section of each webpage's HTML source.
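
For that second way, the page carries a tag such as <meta name="robots" content="noindex, nofollow"> in its head. Below is a minimal Python sketch of how a crawler might read it. The class name MetaRobotsReader and the sample page are invented for illustration, and real crawlers deal with many more edge cases than this.

```python
from html.parser import HTMLParser

class MetaRobotsReader(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )

# Made-up page for illustration.
PAGE = """<html><head>
<title>Members only</title>
<meta name="robots" content="noindex, nofollow">
</head><body>Hello</body></html>"""

reader = MetaRobotsReader()
reader.feed(PAGE)
print(sorted(reader.directives))  # prints: ['nofollow', 'noindex']
```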

Happy searching!
  #6 (permalink)  
Old 10-06-2008, 10:47 PM
Seanie's Avatar
Inspired Webmaster
 
Join Date: Mar 2008
Location: pd
Posts: 132
Rep Power: 7
Seanie will become famous soon enough
Quote:
So, even for several search engine agents, crawlers, spiders, ants or whatever they are called, we aren't allowed to use more than one robots.txt.
But you can make your robots.txt disallow different URLs for different spiders - see the protocol page:
The Web Robots Pages
The robots exclusion standard is voluntary, so bad bots may ignore your robots.txt - many honeypots use this behaviour as a trap. NST have got one:
http://www.nst.com.my/robots.txt

Anybody can see your robots.txt too, so even if robots aren't crawling your 'atmpinnumbersreminder.html', people can see you've blocked it.
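
To show how little effort that takes, here is a hedged Python sketch that pulls the Disallow paths out of a robots.txt. The sample text is invented (it borrows the file name from the joke above) and the function name disallowed_paths is only for this example; the same function would work on the text of any robots.txt you have downloaded.

```python
def disallowed_paths(robots_text):
    """List every path named on a Disallow line, in order of appearance."""
    paths = []
    for line in robots_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and padding
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths

# Invented sample for illustration.
SAMPLE = """\
User-agent: *
Disallow: /atmpinnumbersreminder.html   # so much for keeping it quiet
Disallow: /admin/
Disallow:
"""

print(disallowed_paths(SAMPLE))
# prints: ['/atmpinnumbersreminder.html', '/admin/']
```

So robots.txt is the wrong place to hide anything; it is closer to a public list of the things you would rather not have crawled.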

WMM had a very odd robots.txt, full of all sorts of cruft; maybe they rewrote the URL to something else for a non-robot user-agent. They seem to have changed their policy since I started previewing a reply to this thread last night. I've given up trying to get the same enormous text response as last time, just in case you're watching your log, admin!

Google must be crawled by more robots, but theirs doesn't contain any cruft:
http://www.google.com/robots.txt
- and they don't discriminate between robots, but they do stop them crawling a lot of stuff.

Any robot can look at anything at exabytes.com.my:
http://www.exabytes.com.my/robots.txt

The government says it doesn't have one:
http://www.gov.my/robots.txt

The CIA says it's secret, so you have to use https, but then it doesn't give it to you when you try. Maybe you can crawl their site, but then they have to kill your robot.

Finally! This is what I was looking for - a robots.txt that has different stuff for different bots:
http://en.wikipedia.org/robots.txt

I hope that's useful.

Oh yeah, one last thing: when robots come back time and time again for the same dynamically generated content, only with different URL arguments, your visitors get funny results from their searches and the robot fetches many times more pages from your site than it needs to. Some robots can apply pattern-matching rules (wildcards rather than full regular expressions) so they won't follow form targets, for example:

http://lolyco.com/robots.txt
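
For anyone curious how such pattern rules behave: the wildcards that some crawlers (Googlebot, for one) understand are * for "any run of characters" and a trailing $ for "end of the URL". A rough Python sketch of that matching follows. The function names and the sample rules are mine, not taken from the lolyco.com file, and a real crawler's matching may differ in the details.

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path rule with * and $ wildcards into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape everything literally, then turn each * into "any run of characters".
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path, disallow_rules):
    """True if any Disallow rule matches from the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)

# Invented rules: block any URL with a query string, and any .cgi script.
RULES = ["/*?", "/*.cgi$"]

print(is_blocked("/search.php?q=robots", RULES))  # prints: True
print(is_blocked("/search.php", RULES))           # prints: False
print(is_blocked("/scripts/form.cgi", RULES))     # prints: True
```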

I hope that's useful!
  #7 (permalink)  
Old 10-06-2008, 11:37 PM
Novice Webmaster
 
Join Date: Jan 2008
Location: Kuala Lumpur
Posts: 52
Rep Power: 7
Site Booster is on a distinguished road
Thanks for all the information although I didn't see why you've quoted one of my sentences! You approved the same thing: There must be one file containing all such information. I think we agree on that.

Besides, recently Google has explained this issue in Google Webmaster Central.

I think it could at least clarify Google's point of view.
__________________
Boost Your Website Business
Organic SEO Consultant
  #8 (permalink)  
Old 11-06-2008, 11:30 AM
Novice Webmaster
 
Join Date: Jun 2008
Location: Subang
Posts: 17
Rep Power: 0
ars@mesrahostin is on a distinguished road
How do you create one?
Is there software to generate it where you can edit the entries?

Abdul Rahman
Microsoft Excel 2007 and Microsoft Word 2007 Flash Tutorials
  #9 (permalink)  
Old 11-06-2008, 12:50 PM
Seanie's Avatar
Inspired Webmaster
 
Join Date: Mar 2008
Location: pd
Posts: 132
Rep Power: 7
Seanie will become famous soon enough
Quote:
Originally Posted by Site Booster
I didn't see why you've quoted one of my sentences
Quote:
Quote:
So, even for several search engine agents, crawlers, spiders, ants or whatever they are called, we aren't allowed to use more than one robots.txt.
We do agree on a fact, but
Quote:
So even for ..., we aren't allowed...
reads (to me) like you're describing a problem. My reply started
Quote:
But...
because I'm explaining how the apparent problem is not really a problem at all.

Does that help? - I can sometimes be over-sensitive to nuance in written language. Such as:

Quote:
I didn't see...
which means that at some time in the past, you would have said "I don't see...", but you are not likely to say it now (you could see, starting some time after that). You would use "I didn't see..." in a post asking why I quoted your sentence, but then understood why before you finished writing the post. I imagine you meant "I don't see...", but what kind of pedant would point out the difference between the present tense and past tense on a forum thread about robots.txt?

If English is sometimes difficult to use, blame the French and the Germans, it's usually the bits borrowed from their languages that cause the trouble.
  #10 (permalink)  
Old 11-06-2008, 12:54 PM
Seanie's Avatar
Inspired Webmaster
 
Join Date: Mar 2008
Location: pd
Posts: 132
Rep Power: 7
Seanie will become famous soon enough
Quote:
Originally Posted by ars@mesrahostin View Post
How do you create one?
Use a text editor - the format is very simple, hard to get wrong. You can check your robots.txt online at Google Webmaster Tools too. I wrote a much longer reply a few minutes ago, but the forum ate my post!
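
If you would rather generate the file than type it, a few lines of script will do. The sketch below is only an illustration (the function name make_robots_txt, the bot name ExampleBot and the paths are all made up): it builds the text from a dictionary and then double-checks the result with Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

def make_robots_txt(rules):
    """Build robots.txt text from {user_agent: [disallowed paths]}."""
    blocks = []
    for agent, paths in rules.items():
        lines = ["User-agent: " + agent]
        # An empty Disallow line means "nothing is off limits".
        lines += ["Disallow: " + path for path in (paths or [""])]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"

# Hypothetical rules for illustration.
text = make_robots_txt({"*": ["/cgi-bin/", "/tmp/"], "ExampleBot": ["/"]})
print(text)

# Sanity check: parse what we generated and ask a couple of questions.
checker = RobotFileParser()
checker.parse(text.splitlines())
print(checker.can_fetch("AnyBot", "http://example.com/tmp/x.html"))   # prints: False
print(checker.can_fetch("AnyBot", "http://example.com/index.html"))   # prints: True
```

Save the generated text as robots.txt in the root directory of the site, the same as you would with a hand-written one.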
  #11 (permalink)  
Old 11-06-2008, 06:05 PM
Novice Webmaster
 
Join Date: Jan 2008
Location: Kuala Lumpur
Posts: 52
Rep Power: 7
Site Booster is on a distinguished road
Quote:
Originally Posted by Seanie View Post
... but reads (to me) like you're describing a problem ...
Don't look at it like we have a problem here. Take it easy, friend!
__________________
Boost Your Website Business
Organic SEO Consultant
  #12 (permalink)  
Old 11-06-2008, 06:12 PM
Novice Webmaster
 
Join Date: Jan 2008
Location: Kuala Lumpur
Posts: 52
Rep Power: 7
Site Booster is on a distinguished road
Quote:
Originally Posted by Seanie View Post
... I can sometimes be over-sensitive to nuance in written language ...
Hey friend, "Don't be over-sensitive!" This is a webmasters' forum, not an English teachers' one. Besides, it's been decades since prescriptive grammarians lost the battle against the descriptive ones ... Let's forget about it and go back to the topic of discussion if we have anything to add. I'm not going to answer this anymore.
__________________
Boost Your Website Business
Organic SEO Consultant



All times are GMT +8.