Just headlines

I rely mostly on web feeds to stay informed. Live Bookmarks was a Firefox feature I was fond of early on. Past Yahoo! Pipes and SimplePie, I learnt to live with Google Reader for a while. Then that service got retired, of course, so I gave Digg and Feedly a quick try. Both felt less than homely and, much like the command-line alternative Newsbeuter, too much for my needs. I had no use for tagging, searching, sharing, shortcuts, suggestions, summaries or media. I just wanted a helping of the latest headlines every so often, preferably in the browser.

Please wait

headlines.js: helps read the news.

Time to get creative then. Putting together a dependency-free aggregator in everyday PHP is surprisingly straightforward. As a first step, something like the following could be called to load, parse and print in-terminal snippets for a single RSS feed,

<?php
// Sample RSS 'parser.php' e.g.,
// php parser.php http://rss-feed-url > headlines.txt

$root = simplexml_load_file($argv[1], 'SimpleXMLElement');

// Failed to load feed xml, cut if atom for now
if (false === $root || 'feed' === $root->getName()) {
  exit(1);
}

$tree = $root->xpath('/rss//item');

foreach ($tree as $node) {
  // Leave out the summaries, ad-lib for csv, tsv
  echo <<<EOT
{$node->title}
{$node->pubDate} - {$node->link}

EOT;
}


Expand for Atom, apply it over an array of sources, cron-schedule the lot, and that's base functionality taken care of. A bit flimsy, and less portable a solution than my news addiction deserves, no doubt. One might choose Go to produce a proper executable instead; Go would also allow for concurrent downloading and marshalling to, say, JSON. Marvellous, but what about the web-facing part? So I can reach for updates on mobile? Cloud deploy? If I were building a whole service, maybe. Or else?
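The cron part, for what it's worth, might look as follows, the paths being invented,

# Refresh the snippets at the top of every hour
0 * * * * php /home/me/parser.php http://rss-feed-url > /home/me/headlines.txt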

As for the web-facing part, parsing XML in vanilla JavaScript is thankfully super easy. Utilising DOMParser and querySelectorAll is literally all it takes. For example,

function parseFeed(text = 'The result of some fetch request', parser = new DOMParser()) {
  const root = parser.parseFromString(text, 'text/xml')
  const tree = root.querySelectorAll('item, entry')

  return Array.from(tree).map((node) => {
    // What a treat, I can search for node children using fallback selectors
    const { textContent: title } = node.querySelector('title, summary')
    // Need a `pubDate` for RSS
    const { textContent: date } = node.querySelector('updated, published, pubDate')
    // Expect an `href` attr with atom feeds
    const link = node.querySelector('link')

    return { date, title, link: link.getAttribute('href') || link.textContent }
  })
}
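Hypothetically, wiring that up to window.fetch over a couple of placeholder URLs,

const urls = [
  'http://cors-enabled-feed-url-1',
  'http://cors-enabled-feed-url-2',
]

// Download in parallel, flatten into a single list, sort newest first
Promise.all(urls.map((url) => fetch(url).then((res) => res.text())))
  .then((texts) => texts.flatMap((text) => parseFeed(text)))
  .then((items) => items.sort((a, b) => new Date(b.date) - new Date(a.date)))
  .then(console.log)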

And considering how Web Components, Promise.all, template literals and a ServiceWorker-backed window.fetch are now widely available, feed reading might after all be reduced to declaring an embeddable custom element,

<!-- client.html -->
<headlines-maybe src="http://cors-enabled-atom-or-rss-feed-url-1">
  <headlines-maybe src="http://cors-enabled-atom-or-rss-feed-url-2">
    <!-- Fetch each `src`, parse, merge and fill in Shadow DOM -->
  </headlines-maybe>
</headlines-maybe>

// module.js
customElements.define('headlines-maybe', class HeadlinesMaybe extends HTMLElement {})
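The definition then just needs loading from the page as a module, something like,

<!-- client.html, continued -->
<script type="module" src="./module.js"></script>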

Note how naturally HTML nesting covers joining multiple feeds,

class HeadlinesMaybe extends HTMLElement {
  // ...
  connectedCallback() {
    // Allow nesting, but bail when inside an element of the same type
    if (this.parentNode && this.parentNode.localName === this.localName) {
      return
    }

    // Make sure fetching is avoided unless the tag has context
    if (this.isConnected) {
      // Collect `src` urls, including self
      const children = this.querySelectorAll(this.localName)
      const sources = [this, ...children]
        .filter(o => o.hasAttribute('src'))
        .map(o => o.getAttribute('src'))

      this.render(sources)
    }
  }

  render(sources) {
    // Load, parse, merge and sort, and display results for each source
  }
}
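Filling in that stub, a minimal sketch reusing parseFeed from earlier; the sorting and markup here are my assumptions rather than what the module actually ships,

  async render(sources) {
    // Download all feeds in parallel, then parse, merge and sort newest first
    const texts = await Promise.all(sources.map(src => fetch(src).then(res => res.text())))
    const items = texts
      .flatMap(text => parseFeed(text))
      .sort((a, b) => new Date(b.date) - new Date(a.date))

    // Fill in Shadow DOM with one linked headline per item
    const shadow = this.shadowRoot || this.attachShadow({ mode: 'open' })

    shadow.innerHTML = items
      .map(({ title, link }) => `<p><a href="${link}">${title}</a></p>`)
      .join('')
  }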

Coupled with a stale-while-revalidate caching handler for the handful of feeds I'm interested in, I find loading times negligible. That handler boils down to something like the following in a service worker; a generic sketch, the cache name invented,
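// sw.js, serve cached copies right away, refresh them in the background
self.addEventListener('fetch', (event) => {
  event.respondWith(
    caches.open('headlines').then(async (cache) => {
      const cached = await cache.match(event.request)
      const fresh = fetch(event.request).then((response) => {
        cache.put(event.request, response.clone())

        return response
      })

      return cached || fresh
    })
  )
})

Module home is @thewhodidthis/headlines ›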