In this unconventional article about website modding, I’m going to share how I keep page visits in a persistent history. This isn’t meant to be a step-by-step tutorial; what matters is the concept and the interconnected technologies. So unless you are already a developer, this won’t mean much to you. You could call it plugin development, and at some level it is a WordPress plugin. But I’m not going to share the source with you. Why? Every site is different, and my code isn’t universal, so it wouldn’t work for you as is. Writing those few hundred lines is the least of our concerns. What I’d like to teach you is the mindset of using anything at your disposal to reach a goal.
What is persistent history and why do I need it?
I have a knack for binge reading or watching content. For me, that means if I find a valuable or entertaining source, I want everything from it. Imagine a content-rich WordPress site with hundreds if not thousands of articles. A prime example is the Steve Pavlina personal development blog. You could download the entire blog to a Kindle (I’ve yet to write about how I do that). However, what if you don’t necessarily want to read everything, or at least not in sequential order? You’d soon lose track of what you’ve already read and what is new to you. Browser history won’t help, as it doesn’t last forever. This solution aims to solve that problem with a custom-made, persistent history service that is in your control.
The solution in a nutshell
Whenever you visit a single post on a blog, or an archive (post list), your browser will communicate with a local server that has the persistent history. It’ll either log your visit or recognize visited posts and visually change the 3rd party site to reflect the read status of single or multiple posts.
For the client side, we use tools mentioned in the Change a website to your liking article. The server side is a simple WordPress plugin that creates a small API that handles the requests and stores everything in a database table.
You’ll see my reasoning behind the technology choices, but generally, you can assume I just pieced these together as if they were LEGO bricks. It’s my take on the problem with the tools I know. I’m sure there is a better or more elegant way.
The client side: a Chrome-like browser
A userscript can be restricted to specific sites, so it won’t interfere with every website you visit. I used Tampermonkey to add a few words somewhere on the target site to let me know that a particular post is already read. Most sites come with jQuery, which makes inserting an element wherever I want even easier.
On single posts, this script will send the URL or post ID to the persistent history server. You can get the ID from the source of the page. If the post has been read before, the server will let the browser know. What counts as read is up to you. Spent a minute on the article, or scrolled to the bottom? Clicked a button to mark it read? You decide, as you’ll code it.
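To make the "what counts as read" decision concrete, here is a minimal sketch of one possible rule. It is written in TypeScript (strip the type annotations to paste it into Tampermonkey). The signal names and both thresholds are my own assumptions for illustration; your rule may combine the signals differently.

```typescript
// Hypothetical signals a userscript could collect while a post is open.
interface ReadSignals {
  secondsOnPage: number;   // time since the page loaded
  scrollFraction: number;  // 0 = top of the article, 1 = bottom reached
  markedManually: boolean; // the reader clicked a "mark as read" button
}

// Hypothetical thresholds; tune them to your own reading habits.
const MIN_SECONDS_ON_PAGE = 60;
const MIN_SCROLL_FRACTION = 0.95;

// One possible rule: a post counts as read if it was marked by hand,
// or if the reader both stayed long enough and scrolled (almost) to the end.
function countsAsRead(signals: ReadSignals): boolean {
  if (signals.markedManually) {
    return true;
  }
  return (
    signals.secondsOnPage >= MIN_SECONDS_ON_PAGE &&
    signals.scrollFraction >= MIN_SCROLL_FRACTION
  );
}
```

Only when a check like this passes would the script send the URL or post ID to the server.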
On archives, the script gets the full list of already-read articles from the server (most conveniently identified by their URLs). After that, it’ll cross-reference them with the handful of posts shown together on one page of the archive. I add a class to those posts so the userstyle can visually change them.
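The cross-referencing step could look roughly like the sketch below (TypeScript again). It assumes posts are identified by URL and that small differences such as a trailing slash or a `#comments` fragment should not prevent a match. The function names are hypothetical, not taken from my plugin.

```typescript
// Normalize a URL so trivial differences do not hide a match:
// drop the fragment (e.g. "#comments") and any trailing slashes.
function normalizeUrl(url: string): string {
  const hashAt = url.indexOf("#");
  const withoutFragment = hashAt === -1 ? url : url.slice(0, hashAt);
  return withoutFragment.replace(/\/+$/, "");
}

// Given the read URLs returned by the server and the post links found on
// the current archive page, return the page links that are already read.
function findReadLinks(readUrls: string[], pageLinks: string[]): string[] {
  const readSet = new Set(readUrls.map(normalizeUrl));
  return pageLinks.filter((link) => readSet.has(normalizeUrl(link)));
}
```

In the userscript, the returned links are the ones whose post containers would get the extra class (for example with jQuery's `addClass`), so the userstyle can grey them out. The selector for finding post links depends entirely on the target site's markup.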
The calls are AJAX (XHR) requests made with jQuery, and the data type is JSON. JSONP is an option if you’d rather sidestep cross-domain restrictions.
I use Stylish to grey out read articles in lists/archives. It’s possible because the userscript marks them. Any (inserted or native) element can receive styles from this CSS, just to make things more beautiful. The “Hey, you’ve already read this!” note is more noticeable if it has a proper style.
The server side: a Raspberry Pi
Required software or dependencies
The server can be any of your WordPress sites on any hosting. I felt it’s better not to mix things, so I created a new, empty WP site just for this. I don’t like to clutter my desktop PC with server-like services, which is why I’m not storing the data locally. Instead, I used an always-on device on the local network: my Raspberry Pi. Sometimes I had to install things differently than on Ubuntu, so I’ve learned a lot from this little exercise, namely how to add these to my server:
- Nginx
I’ve had trouble with Apache. Nginx is fast, and I’ve found it easier to configure.
- PHP
The latest version, with all the modules WordPress requires.
- MariaDB or MySQL
Easy to work with, but it involved a lot of command-line work. This will host the persistent history.
I got a little breath of fresh air when I had a GUI to create my table – not having to write much SQL.
- Cloud9 IDE
I use the big brother of this every day, but you can install it on your own server as it’s open source. This IDE runs on the server, and you access it in a browser. It exposes the necessary files for editing, so I can develop in my browser while everything is saved straight to the server (no FTP uploads, no LAN file transfers, and no coding inside Raspbian).
- WordPress
I use it as a framework. I know it, and I’m used to it. It’s much easier to leverage WordPress than to write even helper functions from scratch. Yes, I know that it’s overkill for this “simple” project. But I’ll be using this installation for a few other things too.
The biggest challenge here is that you need a domain (or subdomain) for the next step: SSL. You might not want to expose your web server (ports 80 and 443) to the world – I didn’t. I added the server’s LAN IP along with the hostname (based on my domain) to the hosts file of Windows so that the requests can remain local.
It’s not about being super secret on the LAN. It’s about the 3rd party target blog you are modding. You must have noticed that most sites are HTTPS nowadays. Since this solution integrates with the site as if it were a native part of it, you’ll run into mixed content problems if your server is merely HTTP.
I used Let’s Encrypt from the command line to create certificates. I had to make the webserver component public for it to work, but only very briefly. Let’s Encrypt certificates are valid for 90 days, so I’ll need to renew them before they expire.
The persistent history API
In my efforts, I just used the GET method. You can create an API based on WordPress by registering custom query vars and reacting to them. When you catch these, you manipulate the output, so no regular WordPress content is present in the responses. This approach is nothing fancy, but you have every function of WordPress at your disposal. I find it easier to interact with the database through WordPress than natively with PHP.
It has two endpoints:
- Return a list of read articles for a given site. This passes a list of URLs or IDs to the client for cross-referencing.
- Take an article and try to mark it read. If it doesn’t exist in the database yet, it is recorded as read. If it’s already there, no duplicate is created. Instead, the client is notified of the read status.
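The behavior of the two endpoints can be modeled in a few lines. This is not the plugin itself (that is PHP on top of WordPress, with a database table). It is a TypeScript stand-in that keeps everything in memory, only to show the semantics. The type and function names are hypothetical.

```typescript
// In-memory stand-in for the database table: site -> set of read article URLs.
type HistoryStore = Map<string, Set<string>>;

// Endpoint 1: return the list of read articles for a given site.
function listRead(store: HistoryStore, site: string): string[] {
  const urls = store.get(site);
  return urls ? Array.from(urls) : [];
}

// Endpoint 2: try to mark an article as read.
// The result tells the client whether it had been read before this call.
function markRead(
  store: HistoryStore,
  site: string,
  url: string
): { alreadyRead: boolean } {
  let urls = store.get(site);
  if (!urls) {
    urls = new Set<string>();
    store.set(site, urls);
  }
  const alreadyRead = urls.has(url);
  urls.add(url); // adding the same URL twice never creates a duplicate
  return { alreadyRead };
}
```

The first call for an article reports `alreadyRead: false` and stores it; every later call reports `alreadyRead: true` and leaves the store unchanged. In a real table, a unique key on the site and URL columns would be one way to get the same no-duplicates guarantee.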
I had to make the CORS headers allow any origin to access the API. Your origins are the sites you modify this way. When you make requests to the API through AJAX calls, always include the trailing slash in the API URL. WordPress redirects the URL without the trailing slash to the one with it, and my headers were only added to the final page in the redirect chain, not to the redirect response itself. The browser checks the CORS headers on every response in that chain, so the request fails at the redirect before it ever reaches the properly set header. This gave me a headache!
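A small helper on the client side can guard against the missing slash. The sketch below (TypeScript) builds the GET URL with the trailing slash always in place. The host name and the query variable names (`ph_action`, `ph_url`) are made up for illustration; use whatever custom query vars your plugin registers.

```typescript
// Make sure the path part of a URL ends with a slash, leaving any
// query string or fragment untouched.
function ensureTrailingSlash(url: string): string {
  const cut = url.search(/[?#]/);
  const path = cut === -1 ? url : url.slice(0, cut);
  const rest = cut === -1 ? "" : url.slice(cut);
  return (path.endsWith("/") ? path : path + "/") + rest;
}

// Build a GET request URL for the API. apiBase is expected to be a plain
// URL without a query string; params become the query variables.
function buildApiUrl(apiBase: string, params: Record<string, string>): string {
  const parts: string[] = [];
  for (const key of Object.keys(params)) {
    const value = params[key];
    if (value !== undefined) {
      parts.push(encodeURIComponent(key) + "=" + encodeURIComponent(value));
    }
  }
  const base = ensureTrailingSlash(apiBase);
  return parts.length > 0 ? base + "?" + parts.join("&") : base;
}
```

For example, `buildApiUrl("https://history.example.com/api", { ph_action: "list" })` yields `https://history.example.com/api/?ph_action=list`, so the request goes straight to the page that carries the CORS header and never hits the redirect.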
What persistent history adds to my everyday life
I like having a history that is independent of the one in Chrome (or Vivaldi, my preferred clone of it). I’ve always felt that a lot of user data slows things down. Whenever I install a new system with a fresh browser, everything is so snappy. After a while, once I’ve worn it out and the personal data reaches gigabytes, having it cleared helps a lot. But at the same time, I lose information. The solution I created helps retain the part of that history that matters. Now I’m free to cherry-pick articles from my favorite blog(s), knowing that a system will let me know, perhaps a year from now, whether I have ever encountered that article.
What would you do differently? Do you have any suggestions on how to make it simpler?