|
|||
Can I prevent search engines from crawling my subdomain?
I plan to create a subdomain holding an exact copy of my live, working site, because I want to try out WordPress 2.3 before upgrading the live site, and I will probably start playing with WordPress SVN too. But an exact copy is duplicate content, and I'm worried that search engines will start penalizing my site for it. So what can I do to prevent bots, crawlers, or spiders from crawling this copy of my site?
Thanks in advance. I do plan to play around in a local environment, but I would prefer to test in a Linux environment, e.g. my web host. |
|
|||
Since the subdomain is essentially a folder, how about putting a robots.txt file in the domain root with the following content:
User-agent: *
Disallow: /subdomainfolder/
Or use .htaccess to password-protect the folder: http://www.javascriptkit.com/howto/htaccess3.shtml
Last edited by yonghs : 24-09-2007 at 06:49 PM. |
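As a rough illustration of how a crawler reads the rule above, here is a minimal sketch using Python's standard `urllib.robotparser`. The host `www.maindomain.com` and the page names are placeholders, not taken from a real site. `Disallow` is matched as a path prefix on the host that served the robots.txt file.

```python
from urllib.robotparser import RobotFileParser

# The rule suggested above, as it would appear in the root robots.txt.
rules = [
    "User-agent: *",
    "Disallow: /subdomainfolder/",
]

parser = RobotFileParser()
parser.parse(rules)  # parse the lines directly, no network access needed

# Placeholder URLs: anything under /subdomainfolder/ is blocked,
# the rest of the site stays crawlable.
copy_allowed = parser.can_fetch("*", "http://www.maindomain.com/subdomainfolder/index.php")
live_allowed = parser.can_fetch("*", "http://www.maindomain.com/index.php")

print(copy_allowed)  # False
print(live_allowed)  # True
```

Note that this only tells you what a well-behaved crawler would do. Password protection via .htaccess is the option that actually keeps everyone out.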
|
|||
And I'm not supposed to put that in the robots.txt in the domain root, right? Wouldn't that stop search engines from crawling my main domain? I'm also wondering how effective robots.txt is. Is it a Google-only tool, or do all crawlers and spiders read robots.txt as well? If I am not mistaken, link rel=nofollow applies to Google only, so I am wondering if robots.txt is the same. |
|
|||
It should be placed at the ROOT. You just state the folder to disallow in the robots.txt file.
http://www.maindomain.com/robots.txt
User-agent: *
Disallow: /subfolder/
*****************************************
The above works nicely if you want to block indexing of a folder, but since you are using a subdomain (http://subdomain.maindomain.com), you might put another robots.txt inside the subdomain's folder:
http://subdomain.maindomain.com/robots.txt
This time I'm not sure what the content should be. Is it as below?
User-agent: *
Disallow: /
So if you use a subdomain, you may want to use two robots.txt files to be sure. Please confirm, as I've not used the combination of robots.txt and a subdomain before.
Last edited by yonghs : 24-09-2007 at 10:02 PM. |
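To make the two-file idea concrete, here is a small sketch, again with Python's standard `urllib.robotparser`. Crawlers request robots.txt separately for each host name, so the sketch keeps one rule set per host. The host names are the placeholders used in this thread, and `build_parser` and `is_allowed` are hypothetical helper names. The value after `Disallow:` is a path prefix, so `/` covers the whole host, while a bare word such as `all` is treated as a literal prefix and matches nothing.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def build_parser(text):
    """Build a parser from robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# One rule set per host, mirroring one robots.txt file per host.
ROBOTS_BY_HOST = {
    "www.maindomain.com": build_parser("User-agent: *\nDisallow: /subfolder/\n"),
    "subdomain.maindomain.com": build_parser("User-agent: *\nDisallow: /\n"),
}


def is_allowed(url, agent="*"):
    """Look up the rules for the URL's host, then apply them to the URL."""
    host = urlsplit(url).netloc
    return ROBOTS_BY_HOST[host].can_fetch(agent, url)


print(is_allowed("http://www.maindomain.com/index.php"))        # True
print(is_allowed("http://www.maindomain.com/subfolder/a.php"))  # False
print(is_allowed("http://subdomain.maindomain.com/index.php"))  # False

# A bare word is not a wildcard: this rule blocks nothing.
word_rule = build_parser("User-agent: *\nDisallow: all\n")
print(word_rule.can_fetch("*", "http://subdomain.maindomain.com/index.php"))  # True
```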
|
|||
I googled and found this, and it should answer all the questions:
How do I use a robots.txt file?
That means I only have to put rules in a robots.txt at both the domain root and the subdomain root. Thanks for the answers, people. |
|
|||
You can block spiders with a line in robots.txt, or use the option in WP that tells spiders not to index the blog.
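If you go the WP route, one way to confirm it took effect is to look for a robots meta tag in the page source. The sketch below assumes the option emits a tag like `<meta name='robots' content='noindex,nofollow' />`. That markup is an assumption about the WordPress output, so check your own page source. `RobotsMetaFinder` and `robots_directives` are hypothetical names, and only the Python standard library is used.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html):
    """Return the list of robots directives found in an HTML string."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


# Assumed sample markup; paste in your subdomain's real page source instead.
sample = "<html><head><meta name='robots' content='noindex,nofollow' /></head><body></body></html>"
print(robots_directives(sample))  # ['noindex', 'nofollow']
```

An empty list would suggest the option is not active on that page.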
|
|
|||
I have used all the methods listed here, and so far I think the subdomain isn't indexed. How do I check whether my subdomain is indexed?
|
|
|||
Copy about one sentence of text from your subdomain's content, or the title text, and search for it in Google/Yahoo. See whether the search results show the URL of your subdomain.
Or type this into Google search: inurl:subdomain.yourdomain.com and see if any of the subdomain pages are shown. |