mattpointblank’s Twitter Archive—№ 16,451

        1. Here is a story in three charts: Two weeks ago, we launched a new search tool at work with 200,000+ results pages. A few days ago, the CPU usage for the database powering the search started to look like this (i.e. spiky, with dangerously high usage):
      1. …in reply to @mattpointblank
        We spent a few days tuning performance, adding indexes, tweaking queries etc. No difference. We looked at the logs – lots of Googlebot traffic. We checked our Google Search Console. Googlebot was very, very hungry. And we'd just given it almost half a million new pages.
    1. …in reply to @mattpointblank
      We realised that each of the pages it was crawling made a call to a "related" endpoint which was slow/expensive to render. We quickly disabled this. Here's the almost immediate impact on CPU:
  1. …in reply to @mattpointblank
    Moral of the story: assume all robots are ceaseless, greed-powered machines intent on grinding all of your servers into dust, and code appropriately.
    1. …in reply to @mattpointblank
      The search tool in question: biglotteryfund.org.uk/funding/grants
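
A rough sketch of what "code appropriately" could look like for the "related" call described in the thread. The thread only says the call was disabled; skipping it for crawler traffic alone is one possible variant, not necessarily what was done. Every name here (`CRAWLER_PATTERN`, `is_crawler`, `render_results_page`, `fetch_related`) is hypothetical, and User-Agent matching is a heuristic that can be spoofed.

```python
import re

# Assumption: crawler User-Agent headers tend to contain one of these words.
# This is a rough heuristic, not an exhaustive or authoritative list.
CRAWLER_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)


def is_crawler(user_agent):
    """Return True if the User-Agent header looks like a crawler."""
    return bool(user_agent) and bool(CRAWLER_PATTERN.search(user_agent))


def render_results_page(query, user_agent, fetch_related):
    """Build the page data, skipping the expensive lookup for crawlers.

    fetch_related is a hypothetical callable standing in for the slow
    "related" endpoint mentioned in the thread.
    """
    page = {"query": query, "related": []}
    if not is_crawler(user_agent):
        page["related"] = fetch_related(query)
    return page
```

With this shape, a request whose User-Agent contains "Googlebot" gets a page with an empty `related` list and never touches the slow endpoint, while a regular browser request still does.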