
Welcome!

Homepage & blog of Dr Caterina Constantinescu | Data professional. Psychologist. Paying close attention.

Years later, a new perspective

It has been quite a few years since my last post. So much has changed in the world that it is now difficult to bridge my previous posts with the topic of the day: AI, Large Language Models, agents, the automation of white-collar work, model welfare. While I was silent on this blog, my career path was following the inescapable gravitational pull of AI: I found myself working on large-scale projects side by side with an LLMOps team, and then with an Observability team supporting an AI platform build.

Revisiting Scrapy: Creating spiders and crawling sites

In my previous post about Scrapy, we covered the basics of using CSS and XPath selectors to extract specific content from sites. We also looked at an introductory example of scraping a single page containing an e-shop’s active offers and discounts. No crawling between pages was required for that simple example, and we used the requests library to make the introduction extra gentle. In this post, I’ll share a few details about how to create a Scrapy project, as well as spiders that can crawl between multiple webpages.

Getting started with Scrapy: A beginner's guide to web scraping in Python

I’d long been curious about web scraping and am pleased to have finally made a start in this direction. Previously, any scraping job I needed was carried out via import.io, but now I’ve branched out to Scrapy. I’d also wanted to practise my use of Python, so this was a great opportunity to kill two birds with one stone. Here I’ll share my first foray into this area - it may be useful for others who are also starting out (as well as for my future self, as a reminder).

Using Generalised Additive Mixed Models (GAMMs) to predict visitors at Edinburgh & Craigmillar Castles

If you attended my talk on “Generalised Additive Models applied to tourism data” at the Newcastle Upon Tyne Data Science Meetup in May 2019, please find my (more detailed) slides linked below. Some context: I’d been curious about generalised additive (mixed) models for some time, and the opportunity to learn more about them finally presented itself when a new project came my way as part of my work at The Data Lab.

Four tips for designing visualisations in Shiny

I recently presented a toy Shiny app at the Edinburgh Data Visualization Meetup to demonstrate how Shiny can be used to explore data interactively. In my code-assisted walkthrough, I began by discussing the data used: a set of records detailing customer purchases made on Black Friday (i.e., each customer was given a unique ID, which was repeated in long format in the case of multiple purchases). Both the customers and the items purchased are described along various dimensions.

Exploring transport routes, journey characteristics and postcode networks using R Shiny

Project aims: As part of The Data Lab, I worked on a project visualising the traffic flow within a subsidised transport service operated by a Scottish council. This visualisation needed to display variations in traffic flow conditional on factors such as time of day, day of the week, and journey purpose, as well as other criteria. The overall aim was to explore and identify areas of particular activity, as well as provide some insight into how this transport service might be improved.