no-index attribute on content:id

  • Hi,


    I recently discovered the no-index attribute, but I was wondering what it actually does in practice, since I don't sort any data and simply output images and content from the database.


    That said, I have used Perch PHP functions to retrieve data from the database and re-use it in Perch templates. If, for example, I marked every field with no-index, would I run into trouble at some stage?


    Also, what would happen to the database in terms of previously indexed content, i.e., would my database shrink?


    Cheers, and thank you to the team and developers for remaining positive at this time of change for our beloved Perch CMS.

  • Hi,


    The rule of thumb is that any field you do not need for filtering or sorting does not need to be indexed. In the context of an article:



    PHP
    perch_content_custom('Articles', [
        'sort' => 'date',
        'sort-order' => 'DESC',
        'filter' => 'status',
        'value' => 'Published',
    ]);



    That said, I have used Perch PHP functions to retrieve data from the database and re-use it in Perch templates. If, for example, I marked every field with no-index, would I run into trouble at some stage?

    If you don't need any of these fields for filtering or sorting, you should be fine. If there are fields that you think you may need to filter/sort by in the future, you can either reindex in the future when needed or start indexing these fields from the start.
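
    As a rough illustration of that rule of thumb, the sketch below collects the field IDs that a set of perch_content_custom options sorts or filters by; any other field is a candidate for no-index. fields_needing_index is a hypothetical helper, not part of Perch, and the nested form of 'filter' (an array of filter arrays) is an assumption about how multiple filters are passed.

    ```php
    <?php
    // Hypothetical helper (not part of Perch): returns the field IDs that the
    // given perch_content_custom() options sort or filter by.
    function fields_needing_index(array $opts): array
    {
        $fields = [];

        // The field used for sorting must stay indexed.
        if (isset($opts['sort'])) {
            $fields[] = $opts['sort'];
        }

        if (isset($opts['filter'])) {
            if (is_array($opts['filter'])) {
                // Assumed multi-filter form: each entry has its own 'filter' key.
                foreach ($opts['filter'] as $f) {
                    if (isset($f['filter'])) {
                        $fields[] = $f['filter'];
                    }
                }
            } else {
                // Single filter: the value is the field ID itself.
                $fields[] = $opts['filter'];
            }
        }

        // Remove duplicates and reindex the result from 0.
        return array_values(array_unique($fields));
    }

    $opts = [
        'sort'       => 'date',
        'sort-order' => 'DESC',
        'filter'     => 'status',
        'value'      => 'Published',
    ];

    echo implode(', ', fields_needing_index($opts)), PHP_EOL; // prints "date, status"
    ```

    For the article example above, only date and status would need to stay indexed.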



    Also, what would happen to the database in terms of previously indexed content, i.e., would my database shrink?

    Not immediately. You need to resave the content for Perch to reindex it. For regions, you can use the page republish option.


    In order for the revision indexes to also be updated in the database, you need to resave multiple times (otherwise the number of rows in the index tables may not decrease).


    You can also configure how many revisions Perch keeps in the database in your config file (fewer revisions = smaller database tables). See PERCH_UNDO_BUFFER https://docs.grabaperch.com/perch/configuration/perch/


    PHP
    define('PERCH_UNDO_BUFFER', 3);
  • Thank you Hussein,


    So, before I begin adding no-index to every single perch:content template tag, are there tags where I don't actually need to add this attribute, e.g. tags that just duplicate content, or hidden tags?


    I am not sure my logic is correct, but with the Magic Quotes situation now rectified, I would like to see whether I can optimise my data in a way that yields a performance increase. I don't want to add attributes willy-nilly that don't actually do anything or that will hamper performance in the longer term.


    I set PERCH_UNDO_BUFFER to 2 (I saw a previous post of yours on this subject). I don't really need to worry about revisions, as I am both the content editor and the developer, so can it be set to 1 or 0?


    My site is in development mode as well, but I will save questions on that for a later post, along with configuration questions going forward. I have a feeling that with V4 things will change in ways that allow us to do more, but of course this is speculation on my part.


    Best regards

  • Hi


    1. Duplicated content just uses the settings from the first incarnation of the content's id. Not sure about hidden, but I guess it can't hurt to turn indexing off. Be careful how widely you add no-index, as you might end up having to re-index at a later date. Removing it doesn't magically re-add the indexes, and it can be a pain sorting this out, especially with a live and busy site.


    2. I think the performance gains you are talking about are not worth worrying about.


    3. I generally set PERCH_UNDO_BUFFER to 0 on dev and to 3 on staging and live. It works fine. Who needs 10?!


    4. Don't wait up for v4 :)

  • e.g. tags that just duplicate content or else hidden tags?


    Duplicate tags: Perch only checks the first occurrence of the tag in the template, which is the one used for the edit form.


    Hidden tags: they won't be indexed regardless. Though Perch automatically indexes some data (e.g. _id) regardless of whether you use it in the template.




    2. I think the performance gains you are talking about are not worth worrying about.

    This largely depends on how much content you have. I've decreased the collection index table by about 60% for a client by limiting what fields get indexed and the number of revisions. Besides the performance gains, even the size of backups becomes noticeably smaller (which can be a business factor if you pay for cloud storage).


    Having said that, it is not necessary to be very strict about this from the start, particularly with smaller projects and regions. Limiting the number of revisions with PERCH_UNDO_BUFFER early on, as ellimondo suggested, is probably a good idea.
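
    One way to apply per-environment values like the ones ellimondo mentions is to pick the buffer size in the config file. This is only a sketch: APP_ENV is a hypothetical environment variable, not a Perch setting, and 0 and 3 are simply the values suggested earlier in this thread.

    ```php
    <?php
    // APP_ENV is an assumed, site-specific environment variable (not a Perch setting).
    // Fall back to 'production' when it is not set.
    $env = getenv('APP_ENV') ?: 'production';

    // 0 in development, 3 on staging and live (the values from this thread).
    define('PERCH_UNDO_BUFFER', $env === 'development' ? 0 : 3);
    ```

    If your config already has another way of telling environments apart, the same conditional can hang off that instead.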

  • Thank you for the replies.


    I have started adding no-index to some fields but is it true that I must save each templated region or content area individually?


    In other words, can I not simply use something like Republish Pages instead?


    Perch already has major performance issues when it comes to saving content. For example, I open all content areas for a particular page in new tabs of a web browser (any browser) and click each "Save Changes" button individually, but I must click it again or use CTRL+S before I see "Content Saved Successfully".


    Saying that, it can take up to 10 seconds to actually save content!


    I am holding out hope that V4 actually improves performance of Perch and offers new features as well, particularly in the area of saving content because I'd not want to pick up a web developer job from someone who used Perch before me were I to know the true state of play.

  • I think the republish feature reindexes all regions on all pages.

    OK, so refresh my memory, if you will, about what a region is: is it the result of perch_content or perch_content_create, or is it just an area within a Perch template?


    Just to add, the performance issues I am noting are tied more to running on localhost than on server hardware. The issue of having to save twice persists, though, and I believe it should be fixed, particularly when you want to do things that require updating a region where no visible content has actually changed.