Developer Diary, Marathi Language Day
Primarily in Maharashtra, today people celebrate Marathi Language Day, sometimes referred to as a Pride Day, set to the birthday of poet Kusumāgraj (Vishnū Vāman Shirwādkar). You all know, by now, that I can’t really pass up a minority language…
No, I won’t try to pretend that programming languages have a serious connection to human languages in trying to build a segue to the rest of the post…
The big headline, this week? My Twitter archive no longer rests on a pile of URL-shorteners. I wrote a clunky but operational script to dig through a file, pick out the URLs, and pull every shortened URL to find where it redirects—possibly repeatedly, since we sometimes have multiple layers of software, each of which wants its own tracking—and replaces it.
After running that against the
tweets.js file from the download, most shortened URLs have vanished. Those few that remain have broken in some way, whether cut off in an old-style retweet overflowing the old length-limit or the URL no longer exists in the shortener’s database. In the former case, the tweet contents didn’t originate with me, so sharing it probably already gets ethically dicey, let alone trying to alter what someone else posted. In the latter case, I don’t know that I can do anything to fix it, if I shared somebody else’s shared link, and they removed it.
More importantly, though, the shortened links dropped to a tiny fraction, which allows the pre-generated analytics to show my actual most-shared sources. Most of them don’t surprise me, though I can’t quite figure out what fifty-three Twitter links might refer to. The tweets also no longer rely on the continued existence of the
t.co server or Buffer, which accounted for by far the largest fraction of the shortened URLs.
Oh, and inevitably, many of those un-shortened URLs now have extremely long query strings that nobody needs. I don’t like how it looks, but also I don’t see any serious harm in leaving them in, especially when compared to the harm that could occur if a “false positive” removes a query string that a reader would actually need to see the correct page.
While I had the repository open, I also fixed the color contrast for the search results in dark mode. That styling turns out to not come from the main CSS file, but the search library itself, so I needed to override that color.
Mastodon Tool Trunk
I started the week writing some documentation for this project, since I realized that it didn’t have any guidance beyond the initial project description.
Because the script has a lot of options and the toots, content warnings, and image descriptions can all run fairly long, I also added a script for testing. It doesn’t actually accomplish much of substance, but it does create a worthless image for the script to upload, and then uses it as a media attachment for the scheduled toot.
I don’t know who uses this repository as it stands, other than myself. I doubt anybody, given the number of extremely specific choices that you would need to have made for it to produce a worthwhile newsletter for you. But regardless of any speculation, I’ve had a recurring problem with it that usually takes me an extra try (or two) to figure out why certain blog posts don’t get URLs.
Unsurprisingly if you’ve read through the code, this happens because I still have a manual step to the process, copying my blog’s current
blogurls.json file to the newsletter generator’s folder. I kept it as a manual step, because I didn’t want the management script to try to guess where to find the blog output.
That problem already has a ready solution, though: The configuration file. It now has a place to note the blog’s folder, so that it can copy the correct file automatically.
Even though I use it with some regularity throughout the year, I think of my quote-management script as something of a forgotten repository. The code uses no libraries, so I never get notices telling me to update something. The complicated parts—the download and parsing code—rarely need another attempt. And the code to print a random quote almost always works. As a result, I tend not to think about it, unless I need to fill a quote-toot “slot.”
A couple of weeks ago, though, I got an error for trying to print a footnote for a quote that didn’t contain a footnote. As a result, the script now checks to see if such a note exists before trying to print it.
This doesn’t quite involve a change to anything except my process, but starting with yesterday’s post on the Five Stages of AI Grief, if you look at the source of posts, you’ll start to see an item called
teaser in the post front-matter. For example, this post’s source describes itself there as follows.
This week’s updates include my Twitter Archive, the Mastodon Tool Trunk, Ham Newsletter, Quotation Extractor, the blog itself, and an updated library.
That summary won’t (generally) appear on the blog, but rather goes in announcements. If you follow me on at least some platforms other than the blog’s RSS feed—Buy Me a Coffee, Cohost, Post.News, Spoutible—you might notice that my blog announcements usually have more too them than the title and URL. I’ve given them more attention because the sites have growing communities where I might get lost in the crowd, and I manually post to those places, so I wrote those on the fly.
Writing the blurbs at post-time doesn’t make much practical sense, though. It separates content that I’ve written for the blog from the blog. It doesn’t get the same spelling and grammar checks that the posts do. And writing them at the last minute means that I need to do extra work at a time when I want to get the announcements out. By pulling the teaser inside the post’s source file, I fix all those problems…except that I need to manually post to all those sites, but they’ll develop APIs eventually.
In addition, now that I have this as part of the post, I can think about adding it to the automated Mastodon and quasi-automated Diaspora announcements.
Entropy Arbitrage, Part 2
OK, I “shaded the truth” about limiting the blog changes to my personal process.
Yesterday, my laptop had a minor…well, not a crash, but it stopped working usefully minutes before I wanted to publish the post. Ubuntu declared my hard drive read-only, until I ran repair utilities on a reboot. As a result of those repairs, a couple of files got corrupted.
In particular, the cache for the blog’s GitHub plugin—the plugin that pulls and formats those little preview tiles that you can see for repositories over to the right—became some nonsensical binary file. As far as I knew, I didn’t need to preserve the cache, especially when going into rebuilding the blog to publish the next post, so I deleted it. When that caused an error, I created an empty file. When the empty file caused an error, that looked like a deeper problem.
In the end, this had a straightforward fix. Since every invocation of the plugin rewrites the cache, at least if it needs to pull new information from GitHub, it now skips parsing the file and returns an empty object, if the file has no content to parse.
Mind you, this raises questions about why I haven’t had this problem regularly—the
rebuild.sh script explicitly replaces the cache with an empty file, after all—but it seems to solve the overall problem.
And I suppose that I should worry about needing to fix the hard drive. It has happened more than once, so maybe I should start pricing replacements…
Because I didn’t have anything better ready to go, I bumped the requested library version on Renew DB.
As you can read, I didn’t really get around to doing anything useful with the Mastodon Tool Trunk, particularly the scheduler that I need, so I’ll need to focus on that.
Credits: The header image is Ancient scriptures on the walls in Big Temple, Thanjavur - 2 by Junykwilfred, made available under the terms of the Creative Commons Attribution Share-Alike 3.0 Unported license.
By commenting, you agree to follow the blog's Code of Conduct and that your comment is released under the same license as the rest of the blog.Tags: programming project devjournal