We have now received feedback on the pilot deployment of our linkchecker module. We would like to thank Ed Bilodeau of Libraries, Victor Chisholm from the Faculty of Science and Lysanne Larose from the Faculty of Law for their valuable feedback; this has been a great way to validate the package we’ve put together.
What we have learned:
- “Linkchecker. Finally. Hooray!”
- Broken link messages at top of node edit pages are useful.
- Broken links report is sortable by URL, Response and Error fields.
- Having both the response code and the error message is useful.
Areas for improvement:
- No checking is done on unpublished pages.
- Some error messages are not clear enough.
- URL field title is misleading; maybe change it to Broken Link or Link to check.
- Operations field title is misleading; maybe change it to Nodes with links to check.
- Edit link should point to friendly URL, not node ID.
- Operations should be the first column, not the last.
- The report is paginated with no way to view all.
- The broken links are truncated, making the report difficult to print.
- Operations field is not sortable; it would be nice to be able to sort by source node.
- Messages are displayed to anyone who has edit access, not just Site Managers.
- 301 responses are sometimes confusing.
- Pages requiring authentication are not handled well.
Some notes on this feedback:
- 301s can sometimes be confusing. 301 means Moved Permanently, which in the case of a simple redirect is easy to understand, but there are other reasons that web servers return 301. The most confusing reason is missing trailing slashes. For example, the URL http://www.mcgill.ca/eps will return a 301, even though it’s a valid site. This is because the correct URL for that page is actually http://www.mcgill.ca/eps/ – notice the extra forward slash on the end. The reason we don’t return a 200 for both is that this could cause search engines to flag it as duplicate content. For more information on this topic, see this Google Webmaster blog post.
- Using friendly URLs instead of the node ID. This is standard behaviour in the admin interface: whenever you edit a node anywhere in the admin interface, including Content..Edit, you will be taken to a node ID URL. For the sake of consistency, we probably won’t change this just for the linkchecker interface.
- Operations as the last column. This is also a standard UI decision across all Drupal admin pages, so it will probably stay the same too. However, we could look at adding another column for the source information.
- Messages displayed to all editors. This makes sense for most sites, as the users editing the content are the ones who will be fixing broken links. However, it might be worth making the permissions for this more granular to cover edge cases.
- Checking of unpublished content. It’s worth reading this issue in the linkchecker issue queue, which discusses why linkchecker does not check unpublished nodes.
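To make the trailing-slash case from the 301 note above more concrete, here is a minimal Python sketch. It is not part of the linkchecker module, and the function name `is_trailing_slash_redirect` is our own invention for illustration. It assumes you already have the URL that was requested and the `Location` value the server sent back with the 301, and it only reports whether the two differ by a single trailing slash.

```python
from urllib.parse import urljoin, urlsplit


def is_trailing_slash_redirect(requested_url: str, location: str) -> bool:
    """Return True if the redirect target only adds a trailing slash.

    Hypothetical helper for illustration; not a linkchecker API.
    """
    # The Location header may be relative, so resolve it against the request.
    target = urlsplit(urljoin(requested_url, location))
    source = urlsplit(requested_url)

    # Scheme and host must match (host names are case-insensitive).
    same_site = (source.scheme, source.netloc.lower()) == (
        target.scheme,
        target.netloc.lower(),
    )
    # The query string must be unchanged.
    same_query = source.query == target.query
    # The only path difference allowed is one added trailing slash.
    adds_slash = (
        not source.path.endswith("/") and target.path == source.path + "/"
    )
    return same_site and same_query and adds_slash


print(is_trailing_slash_redirect("http://www.mcgill.ca/eps", "http://www.mcgill.ca/eps/"))
print(is_trailing_slash_redirect("http://www.mcgill.ca/eps", "/eps/"))
print(is_trailing_slash_redirect("http://www.mcgill.ca/eps", "http://www.mcgill.ca/engineering/"))
```

The first two calls describe the harmless case from the example above, where the link can simply be updated to include the slash. The third describes a redirect to a different page, which probably deserves a closer look.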
Where we go from here:
The linkchecker module was not developed at McGill; it’s a module from drupal.org that has been around for a long time. If we decide to make any of the suggested changes, the route we take will probably be:
- Create an issue in the drupal.org/project/linkchecker issue queue.
- Possibly make the change ourselves and submit a patch to the issue.
- Alternatively, wait for the module maintainer to make the change.
In the meantime, none of the feedback is critical enough to prevent us from deploying to a wider audience, so we will probably enable linkchecker on all sites fairly soon.
If you have other feedback on this project, don’t hesitate to leave a comment on this post.