Forum Archive - Strange URLs indexed: www.***.com/ar/be/it translation mess

Strange URLs indexed: www.***.com/ar/be/it translation mess
Sep 5, 2011 2:40 pm by HappyEscorts
First of all, thank you very much for your great work, Edward, especially for the new caching system!

Question: Google is indexing strange URLs that stack several country codes, such as:

http://www.HappyEscorts.com/it/dk/
http://www.HappyEscorts.com/ar/be/it
http://www.HappyEscorts.com/sr/ja
http://www.HappyEscorts.com/ru/ko/id/zh-CN

Clicking any of these URLs results in a terrifying translation mess (the pages mix Arabic, Russian, and Japanese content).

These are the correct URLs (and everything works perfectly):

www.HappyEscorts.com
www.HappyEscorts.com/de
www.HappyEscorts.com/fr
www.HappyEscorts.com/it

Additionally, this issue results in tens of thousands of 404 errors (according to Google Webmaster Tools). None of these URLs appear in the sitemaps we submitted via Webmaster Tools.

Any advice is appreciated, thanks.

System:
gtranslate PRO v. 1.5.x.26
Artio SEF v. 3.7.4
joomfish v. 2.1.7
Joomla 1.5.20
php 5.3.3

Regards,
Peter
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Sep 6, 2011 10:09 am by alt_f4
Hands-on solution: create a robots.txt in your (Joomla) root, e.g.

    User-agent: *
    Disallow: /it/dk/
    Disallow: /ar/be/it/
    Disallow: /sr/ja/
    Disallow: /ru/ko/id/zh-CN/

Furthermore, you can exclude the error-causing pages in Google Webmaster Tools (http://www.google.com/webmasters), which you already know. To request removal of the outdated cached version of a page from search results:

1. Verify your ownership of the site in Webmaster Tools.
2. On the Webmaster Tools home page, click the site you want.
3. On the Dashboard, click Site configuration in the left-hand navigation.
4. Click Crawler access, and then click Remove URL.
5. Click New removal request.
6. Type the URL of the page you want removed, and then click Continue. Note that the URL is case-sensitive: you will need to submit the URL using exactly the same characters and the same capitalization that the site uses.
7. Select "The page has changed and Google's cached version is out of date."
8. Click Submit Request.

Regards,
alt_f4

P.S. You should upgrade to Joomla 1.5.23 because of security issues in J.1.5.20: http://www.joomla.org/download.html
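To check what the Disallow lines above actually cover, here is a minimal sketch using Python's standard urllib.robotparser. The host www.example.com and the helper name is_blocked are placeholders chosen for illustration, and the standard-library parser uses plain prefix matching, so a real crawler may behave slightly differently.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules suggested in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /it/dk/
Disallow: /ar/be/it/
Disallow: /sr/ja/
Disallow: /ru/ko/id/zh-CN/
"""


def is_blocked(path, host="http://www.example.com"):
    """Return True if the rules above forbid crawling the given path.

    `host` is a placeholder; only the path matters for these rules.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch("*", host + path)


if __name__ == "__main__":
    for path in ["/it/dk/", "/ar/be/it/", "/ar/be/it", "/it/"]:
        print(path, "blocked" if is_blocked(path) else "allowed")
```

Because the match is a prefix test, "Disallow: /ar/be/it/" does not cover the reported URL /ar/be/it, which has no trailing slash. Dropping the trailing slash from the Disallow paths would cover both forms.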
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Sep 6, 2011 12:04 pm by Edvard
Hi,

You are probably using an old version. You will need to upgrade to the latest version, or use these rewrite rules in your .htaccess file instead:

    # gtranslate config
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/([a-z]{2}|zh-CN|zh-TW)/(.*)$ /$1/$3 [R=301,L]
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/([a-z]{2}|zh-CN|zh-TW)$ /$1/ [R=301,L]

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/(.*)$ /gtranslate/translate.php?lang=$1&url=$2 [L,QSA]
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)$ /gtranslate/translate.php?lang=$1 [L,QSA]
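To see how the two R=301 rules above collapse stacked language codes, here is a rough Python simulation. It is only a sketch of the regex logic, not of Apache itself: the function names are made up for illustration, and the patterns are applied to the path without its leading slash, as mod_rewrite does in .htaccess context.

```python
import re

# Same language-code alternation as in the rewrite rules.
LANG = r"([a-z]{2}|zh-CN|zh-TW)"
STACKED_WITH_REST = re.compile(rf"^{LANG}/{LANG}/(.*)$")  # first R=301 rule
STACKED_BARE = re.compile(rf"^{LANG}/{LANG}$")            # second R=301 rule


def redirect_once(path):
    """Return the 301 target for `path`, or None if neither rule matches."""
    rel = path.lstrip("/")
    m = STACKED_WITH_REST.match(rel)
    if m:
        return f"/{m.group(1)}/{m.group(3)}"
    m = STACKED_BARE.match(rel)
    if m:
        return f"/{m.group(1)}/"
    return None


def follow_redirects(path, limit=10):
    """Follow successive redirects and return every path visited."""
    chain = [path]
    for _ in range(limit):
        target = redirect_once(chain[-1])
        if target is None:
            break
        chain.append(target)
    return chain


if __name__ == "__main__":
    for start in ["/it/dk/", "/ar/be/it", "/ru/ko/id/zh-CN", "/de/page.html"]:
        print(" -> ".join(follow_redirects(start)))
```

Each redirect drops the second code and keeps the first, so /ru/ko/id/zh-CN ends at /ru/ after three hops. Note that any two-letter path segment directly after the language code is treated as a language code, including a genuine two-letter directory.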
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Sep 6, 2011 6:50 pm by HappyEscorts
Thank you Edward, I will try the new rewrite rules. And thanks to you, alt_f4!

However, at HappyEscorts we have some 53,000 real pages (not gtranslated ones); some 43,000 pages are indexed by Google, plus roughly 10,000 by Bing. How can I find out which of these indexed pages have such strange URLs? And even if I had a URL list, should I add them all to robots.txt (probably a few thousand Disallow lines)? It would also be almost impossible to remove all these wrong URLs by hand via Google Webmaster Tools or Bing Webmaster.

Any advice? Does anybody else have similar malformed URLs in Google's or Bing's index?

Regards,
Peter
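One possible way to approach the "which URLs are affected" question is to filter any URL list you can obtain (for example a crawl-error export or server access logs, neither of which is shown in this thread) for paths that begin with two or more language codes. The sketch below assumes absolute URLs with a scheme; the function names are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Same language-code alternation as in the gtranslate rewrite rules.
LANG = r"(?:[a-z]{2}|zh-CN|zh-TW)"
# A path is "stacked" if it starts with two consecutive language codes.
STACKED = re.compile(rf"^/{LANG}/{LANG}(?:/|$)")


def has_stacked_codes(url):
    """Return True if the URL path starts with two or more language codes."""
    return bool(STACKED.match(urlsplit(url).path))


def find_stacked(urls):
    """Return only the URLs whose path stacks language codes."""
    return [u for u in urls if has_stacked_codes(u)]


if __name__ == "__main__":
    sample = [
        "http://www.example.com/it/dk/",
        "http://www.example.com/ar/be/it",
        "http://www.example.com/de/page.html",
        "http://www.example.com/fr",
    ]
    for url in find_stacked(sample):
        print(url)
```

Like the rewrite rules themselves, this treats any two-letter segment as a language code, so a real two-letter directory right after the language code would be flagged as well.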
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Sep 7, 2011 10:50 am by Edvard
If you just change the rewrite rules it will be fixed over time and the strange URLs will disappear from the index.
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Sep 7, 2011 11:54 am by HappyEscorts
Edvard wrote: "If you just change the rewrite rules it will be fixed over time and the strange URLs will disappear from the index."

I will do so, Edward. However, can you please add a line to the rewrite rules? We use joomfish for DE, so the language DE should not be gtranslated.

Thank you in advance.

Regards,
Peter
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Sep 8, 2011 8:43 pm by Edvard
    # gtranslate config
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/([a-z]{2}|zh-CN|zh-TW)/(.*)$ /$1/$3 [R=301,L]
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/([a-z]{2}|zh-CN|zh-TW)$ /$1/ [R=301,L]

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_URI} !^/de
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/(.*)$ /gtranslate/translate.php?lang=$1&url=$2 [L,QSA]

    RewriteCond %{REQUEST_URI} !^/de
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)$ /gtranslate/translate.php?lang=$1 [L,QSA]
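The effect of the added "RewriteCond %{REQUEST_URI} !^/de" lines can be illustrated with a small Python sketch of the two translate.php rules. This only mimics the conditions as written above; the function name and the is_existing_file flag (standing in for the "!-f" file check) are assumptions for illustration, and the redirect rules are assumed to have run already.

```python
import re

# Same language-code alternation as in the rewrite rules.
LANG = r"([a-z]{2}|zh-CN|zh-TW)"
WITH_PATH = re.compile(rf"^{LANG}/(.*)$")  # rule with lang and url
BARE = re.compile(rf"^{LANG}$")            # rule with lang only


def gtranslate_target(request_uri, is_existing_file=False):
    """Return (lang, url) if the request would go to translate.php, else None.

    The url part is "" for a bare language code such as /fr.
    """
    # RewriteCond %{REQUEST_URI} !^/de  (present on both rules)
    if request_uri.startswith("/de"):
        return None
    rel = request_uri.lstrip("/")
    m = WITH_PATH.match(rel)
    # RewriteCond %{REQUEST_FILENAME} !-f  (only on the first rule)
    if m and not is_existing_file:
        return (m.group(1), m.group(2))
    m = BARE.match(rel)
    if m:
        return (m.group(1), "")
    return None


if __name__ == "__main__":
    for uri in ["/de", "/de/kontakt", "/fr/contact", "/zh-CN", "/images/logo.png"]:
        print(uri, "->", gtranslate_target(uri))
```

Requests under /de fall through to Joomla (and so to joomfish), while every other language code is still handed to translate.php.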
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Sep 12, 2011 8:03 am by HappyEscorts
Thank you Edward! We've found a different solution that avoids duplicate content caused by the trailing slash (www.HappyEscorts.com/de vs. www.HappyEscorts.com/de/). What do you think?

    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)$ /$1/ [R=301,L]
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/([a-z]{2}|zh-CN|zh-TW)/(.*)$ /$1/$3 [R=301,L]
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/([a-z]{2}|zh-CN|zh-TW)$ /$1/ [R=301,L]

    RewriteCond %{REQUEST_URI} !^/de
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)/(.*)$ /gtranslate/translate.php?lang=$1&url=$2 [L,QSA]
    RewriteRule ^([a-z]{2}|zh-CN|zh-TW)$ /gtranslate/translate.php?lang=$1 [L,QSA]
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Sep 12, 2011 2:52 pm by Edvard
I don't understand what you did and how that should stop multiple language codes in the URL.
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Oct 20, 2013 12:37 pm by nerdcode
I think the whole URL problem should be solved the other way round. We once had problems using the fix for our Escorts in Germany URL.
Re: Strange URLs indexed: www.***.com/ar/be/it translation
Oct 20, 2013 7:06 pm by Yana
Hi,

You can try adding a rewrite rule to your .htaccess file that appends a trailing slash to the end of the URLs.
