• Good morning
    I have a problem with Yoast SEO and Google Search Console indexing for the site http://www.val-tec.it. Search Console reports 36 pages on the site as not indexed. I have tried validating the fix, but it gives me the error: “No: ‘noindex’ detected in the ‘robots’ meta tag”.
    How can I solve the problem? In the WordPress settings I have allowed search engines to find the site, so I don’t understand what could be blocking my pages. Can you help me or give me a hint on how to solve it? I have already asked the person who manages our site's server, and he replied as follows:

    “I checked the robots.txt and it doesn’t seem to contain anything limiting:
    User-agent: DomainCrawler
    Disallow: /
    User-agent: SemrushBot
    Disallow: /
    User-agent: AhrefsBot
    Disallow: /
    User-agent: *
    Disallow: /.jpg Disallow: /.JPG
    Disallow: /.png Disallow: /.PDF
    Disallow: /.pdf Disallow: /.mp3
    Disallow: /.MOV Disallow: /.mov
    Disallow: /.avi Disallow: /.avi
    Disallow: /.csv Disallow: /.data
    Crawl-delay: 2

    From what I read, the problem is in those specific pages: it seems that their code contains a “noindex” in the robots tag, which blocks indexing by search engines.
    I looked through the template code but could not find any reference to robots or to the word noindex.
    You should contact the template manufacturer and ask them for clarification.”

    Unfortunately, the problem is not solved. I requested indexing again, but it tells me that the problem could not be fixed. If I go to the list of non-indexed pages and open one at random, I get this:

    Page indexing: Page is not indexed: Excluded by ‘noindex’ tag
    Discovery
    Sitemaps: No referring sitemaps detected
    Referring page: https://val-tec.it/filtri-di-processo-filtration-group/
    Crawl
    Last crawl: 7 May 2024, 07:07:56
    Crawled as: Googlebot smartphone
    Crawl allowed? Yes
    Page fetch: Successful
    Indexing allowed? No: ‘noindex’ detected in ‘robots’ meta tag
    Indexing
    User-declared canonical: None
    Google-selected canonical: Inspected URL

    I don’t know what to do anymore.
    Could you help me solve this? I use the OceanWP theme. I am not an IT person and don’t know how to code.

    Thanks so much

    Sibi


Viewing 8 replies - 1 through 8 (of 8 total)
  • Thread Starter sibi84

    (@sibi84)

    Update: I went into the Yoast SEO tools and found the robots.txt file. This is what it contains:

    User-agent: DomainCrawler
    Disallow: /

    User-agent: SemrushBot
    Disallow: /

    User-agent: AhrefsBot
    Disallow: /

    User-agent: *
    Disallow: /.jpg Disallow: /.JPG
    Disallow: /.png Disallow: /.PDF
    Disallow: /.pdf Disallow: /.mp3
    Disallow: /.MOV Disallow: /.mov
    Disallow: /.avi Disallow: /.avi
    Disallow: /.csv Disallow: /.data
    Crawl-delay: 2

    Can anyone tell me if there is an error to correct?
    I ask because I’ve noticed that most of the pages on my site blocked by robots.txt are links to PDFs, for example https://val-tec.it/wp-content/uploads/2023/03/scheda-power-cool-180-evans.pdf

    Plugin Support Maybellyne

    (@maybellyne)

    Hello @sibi84,

    Thanks for using the Yoast SEO plugin. I’ve addressed your concerns below:

    noindex detected in the ‘robots’ meta tag
    Can you provide one of the URLs that were reported to have this issue? Or does it apply to all your content?
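
    For anyone who wants to check a page themselves before posting a URL, here is a minimal Python sketch that looks for a noindex directive in a page's robots meta tag. The class name, the function name, and the sample HTML are hypothetical and only for illustration; in practice you would paste in the output of "View page source" for the affected page.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute looks like "noindex, follow".
            parts = [p.strip().lower() for p in attr.get("content", "").split(",")]
            self.directives.extend(p for p in parts if p)


def has_noindex(html):
    """Return True if any robots meta tag in the HTML contains noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical page source, standing in for a real page.
sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(sample))
```

    This only inspects the HTML meta tag. A noindex can also be sent as an HTTP header, which this sketch does not cover.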

    Robots.txt
    Your robots.txt file does not meet our recommended guidelines. I suggest using the following:
    # START YOAST BLOCK
    # ---------------------------
    User-agent: *
    Disallow:
    Sitemap: https://val-tec.it/sitemap_index.xml
    # ---------------------------
    # END YOAST BLOCK
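
    As a rough sanity check of these directives, the sketch below feeds them to Python's standard-library robots.txt parser and asks whether a crawler may fetch one of the PDF URLs mentioned in this thread. The helper name build_parser is hypothetical. Note that Python's parser is simpler than Google's (for example, it does not handle wildcards), so treat the result as an approximation rather than as what Googlebot will do.

```python
from urllib.robotparser import RobotFileParser

# The recommended directives, copied from the reply above.
RECOMMENDED = """\
# START YOAST BLOCK
# ---------------------------
User-agent: *
Disallow:
Sitemap: https://val-tec.it/sitemap_index.xml
# ---------------------------
# END YOAST BLOCK
"""


def build_parser(robots_txt):
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(RECOMMENDED)
pdf_url = "https://val-tec.it/wp-content/uploads/2023/03/scheda-power-cool-180-evans.pdf"

# An empty "Disallow:" means nothing is blocked for the matching user agents.
print(parser.can_fetch("Googlebot", pdf_url))
print(parser.site_maps())
```

    The same helper can be pointed at the text of any other robots.txt file to see how this parser reads it.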

    Thread Starter sibi84

    (@sibi84)

    Hello @maybellyne, first of all: thanks so much for the reply.
    The block for indexing concerns 88 pages, of which:

    • 19 blocked by robots.txt
    • 37 excluded by the ‘noindex’ tag
    • 25 crawled but currently not indexed

    Should I put the code block you sent me in the robots.txt file deleting everything else, or in a specific place? Sorry, but I really don’t know anything about code.

    Here are some examples of links blocked by robots.txt (most are PDFs of our product data sheets):

    https://val-tec.it/wp-content/uploads/2023/03/scheda-power-cool-180-evans.pdf

    https://val-tec.it/wp-content/uploads/2024/05/Vestas-Wind-Turbine-Gearbox-Lube-Oil.pdf

    And here are some examples of links to pages excluded by the “noindex” tag:

    https://val-tec.it/2023/10/12/

    https://val-tec.it/2024/06/13/

    The strange thing about these is that I do not have any articles on the site with the date as the permalink. My permalink setting always uses the title of the page, so I do not know why these pages with a date as the permalink were created.

    Thank you so much for the advice you are giving me, I really appreciate it

    Plugin Support Maybellyne

    (@maybellyne)

    Should I put the code block you sent me in the robots.txt file deleting everything else, or in a specific place? Sorry, but I really don’t know anything about code.

    Yes, I recommend removing everything else and using the directives I shared previously.

    Here are some examples of links to pages I have blocked by robots (most are pdfs of data sheets of our products)

    According to Google, the simplest way to prevent PDF documents from appearing in search results is to add an X-Robots-Tag: noindex in the HTTP header used to serve the file. We also have additional information on using the X-Robots-Tag here: https://yoast.com/x-robots-tag-play/
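
    To see whether a PDF is already being served with such a header, a small check like the following could be used. It is only a sketch: the function name and the sample header dictionaries are hypothetical, and the parsing covers the common forms (a plain directive list, or a directive scoped to one crawler such as "googlebot: noindex").

```python
def header_blocks_indexing(headers):
    """Return True if an X-Robots-Tag header asks search engines not to index."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        # The value may be "noindex, nofollow" or scoped, e.g. "googlebot: noindex".
        for part in value.split(","):
            directive = part.split(":")[-1].strip().lower()
            # "none" is shorthand for "noindex, nofollow".
            if directive in ("noindex", "none"):
                return True
    return False


# Hypothetical response headers for a PDF, for illustration only.
blocked = {"Content-Type": "application/pdf", "X-Robots-Tag": "noindex, nofollow"}
open_pdf = {"Content-Type": "application/pdf"}

print(header_blocks_indexing(blocked))
print(header_blocks_indexing(open_pdf))
```

    With real responses, the headers could come from the standard library, for example urllib.request.urlopen(url).headers, which supports the same items() call.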

    here are some examples of links to pages that I have excluded based on the “noindex” tag

    These are date archives, and they redirect to your homepage, as we always recommend. Date archives (e.g. https://val-tec.it/2023/10/12/) group posts by publication date. From an SEO perspective, the posts in such an archive have no real relation to one another beyond their publication date, which says little about the content. Date archives can also lead to duplicate content issues. This is why we recommend disabling them.

    I have set it as a setting that there is always the title of the page as the permalink so in reality I do not know why these pages with the date as the permalink were created.

    These are created by WordPress so it’s good that they already redirect to the homepage.

    Let me know if you have follow-up questions. I’ll be happy to answer them!

    Thread Starter sibi84

    (@sibi84)

    I added the lines to the robots.txt file as per your advice, thank you.
    I then searched the web for how to disable date archives and found this article: https://vielhuber.de/it/blog/disattivare-le-pagine-archivio-wordpress/

    so I added this code:

    set_404();
    status_header(404);
    nocache_headers();
    }
    }
    add_action('template_redirect', 'disable_uneeded_archives');

    But now everything has disappeared from the products page and the news page of the site. I have tried re-inserting the menu items (which were links to the articles in the categories ‘products’ and ‘news’), but still nothing shows on those two pages:

    https://val-tec.it/category/prodotti/
    https://val-tec.it/category/news/

    I’ve tried going back into functions.php to remove the lines I inserted above, but it tells me this:

    Your changes to the PHP code were cancelled due to an error on line 10 of the file wp-content/themes/oceanwp/header.php. Please correct and try saving again.

    Uncaught Error: Call to undefined function oceanwp_html_classes() in wp-content/themes/oceanwp/header.php:10
    Stack trace:
    #0 wp-includes/template.php(810): require_once()
    #1 wp-includes/template.php(745): load_template()
    #2 wp-includes/general-template.php(48): locate_template()
    #3 wp-content/themes/oceanwp/page.php(12): get_header()
    #4 wp-includes/template-loader.php(106): include('…')
    #5 wp-blog-header.php(19): require_once('…')
    #6 index.php(17): require('…')
    #7 {main}

    thrown

    What have I done? The code I found on that site probably caused the error, but I don’t know how to fix it. Perhaps it would have been better to live with the archive problem than to have this one.
    Can you help me fix it?
    Thank you very much

    Thread Starter sibi84

    (@sibi84)

    PS: line 10 of the header.php file is this:

    <html class="<?php echo esc_attr( oceanwp_html_classes() ); ?>" <?php language_attributes(); ?>>

    Plugin Support Maybellyne

    (@maybellyne)

    I went searching on the web for how to disable archive dates as per your advice

    Date archives were already disabled by Yoast SEO; that’s why this date archive redirects to your homepage. The setting is found in WordPress > Yoast SEO > Settings > Advanced > Date archives. The code snippet you found is unnecessary.

    Your changes to the PHP code were cancelled due to an error on line 10 of the file wp-content/themes/oceanwp/header.php. Please correct and try saving again.

    Since you have edited the theme’s functions.php file, I would download a fresh copy of the theme, take the original functions.php from it, and use it to replace the modified file. You could also speak to the theme developers.

    Thread Starter sibi84

    (@sibi84)

    Thank you very much, I replaced the file and that fixed the problem on the site.
    As for the indexing problems, I hope the modified robots.txt file will help. I have already restarted the indexing validation, so we will see what happens.

    Thanks again for everything, you have been very kind.
