New campus-industry startup will generate local stories, build students’ data journalism skills

By KRISTEN CUSSEN
LOCAL NEWS DATA HUB

A new collaborative data journalism project launched by students and faculty from the School of Journalism at X (also known as Ryerson) University will generate stories for local newsrooms across Canada and train the next generation of reporters and editors.

The Local News Data Hub (LNDH), spearheaded by School of Journalism professor April Lindgren, brings together students, journalists, faculty members and industry partners. 

Student journalists and data analysts on the Data Hub’s reporting team identify datasets that can be used to produce stories for multiple communities. Once a dataset is selected, the journalists do the reporting required to produce a story template that is then automatically populated with data for specific communities. The local stories generated by this application of semi-automated journalism are then shared with the Canadian Press wire service and distributed to CP clients across the country.
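As a rough illustration of the templating step described above, the short Python sketch below fills one story template with figures for several communities. It is not the Data Hub's actual code. The template wording, the field names (`community`, `tests`, `median_mbps`) and the sample values are all invented for the example.

```python
from string import Template

# Hypothetical story template. The wording and field names are illustrative
# assumptions, not the Data Hub's real template or schema.
STORY_TEMPLATE = Template(
    "Internet users in $community ran $tests speed tests in 2020. "
    "The median download speed was $median_mbps Mbps, "
    "$comparison the national median of $national_mbps Mbps."
)


def compare(local, national):
    """Return a phrase describing how the local figure relates to the national one."""
    if local > national:
        return "above"
    if local < national:
        return "below"
    return "equal to"


def build_stories(rows, national_mbps):
    """Populate the template once per community and return {community: story text}."""
    stories = {}
    for row in rows:
        stories[row["community"]] = STORY_TEMPLATE.substitute(
            community=row["community"],
            tests=f"{row['tests']:,}",  # add a thousands separator
            median_mbps=row["median_mbps"],
            comparison=compare(row["median_mbps"], national_mbps),
            national_mbps=national_mbps,
        )
    return stories


# Placeholder rows invented for illustration; these are not real test results.
SAMPLE_ROWS = [
    {"community": "Exampleville", "tests": 1200, "median_mbps": 12.5},
    {"community": "Sampleton", "tests": 850, "median_mbps": 48.0},
]

if __name__ == "__main__":
    for text in build_stories(SAMPLE_ROWS, national_mbps=30.0).values():
        print(text)
```

The sketch covers only the final, automated step. Choosing the dataset, checking the numbers and writing the template still depend on a journalist's judgment.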

“I want to do more than just track the bad news about local journalism – I want to do something to try and shore up what’s left,” said Lindgren, who is the principal researcher for the Local News Research Project. Data from Lindgren’s Local News Map show 449 local news outlets closed in 323 communities between 2008 and Aug. 1, 2021. Over the same period, only 172 news operations launched in 124 places.

“Our goal with the Data Hub,” Lindgren said, “is to extract customized stories for multiple communities from a single set of data and to make those stories available to local media. Reporters in those newsrooms can add local comment to the stories or they can be published as is.”

Michelle Allan, who worked on the project during her second year as a master of journalism student at Ryerson, co-authored the Data Hub’s first story. Her analysis of nearly 69,000 internet speed test results revealed just how bad internet service was in 2020, when the pandemic forced millions of people to work and study from home.

In addition to a national story about the test results, customized stories for 11 communities were distributed via the CP wire and published by local media.

Lindgren said the Data Hub was inspired by two projects in the United Kingdom that mine data for local stories. At the RADAR (Reporters and Data and Robots) news agency, reporters blend journalism with automation by systematically mining open data sources and then drawing upon the data to write stories in template form using natural language processing. The BBC’s Shared Data Unit, for its part, produces national coverage and creates accompanying story packages that include cleaned-up data and instructions so local newsrooms can produce their own versions of the stories.

David Weisz, founder of the annual Data Driven data journalism symposium and director of the Humber College StoryLab, said local newsrooms often struggle with data journalism projects because they are strapped for time, money and, in many cases, the necessary skills. 

Weisz, who collaborated with Lindgren on establishing the Data Hub during the winter 2021 term, said it is an opportunity for students to gain invaluable data skills they can take into the workplace: “We’re seeding the next generation of students, so how can we make this the norm as opposed to the exception? That’s how you make a cultural shift.”

Weisz said that the Data Hub’s experiments with automated journalism “can show newsrooms how it actually might save them money if they invest in these kinds of tools that allow them to expedite usually laborious processes.” 

Lindgren said that while mentions of automation in journalism often spark concerns about job losses, the Hub’s approach requires human involvement that is beyond anything a computer can do: “Our journalists need to find the data, identify the story, report the story and then create the templates to generate local versions. The automation comes into it only for that last part of the process,” she said. “Before we get to that stage, a whole lot of decisions have to be made that require journalism skills and judgement calls.”

The Data Hub, she said, will be transparent about where it gets data and will share the raw data whenever possible. It will also produce a document for each story that describes the data analysis process. The methodology for the internet speed test story, for instance, is available online.

Once the Data Hub is more established, Lindgren says the goal is to make it a go-to place for local journalists who need help: “We’re aiming to be a resource that news organizations can approach if they have a set of data that they’re struggling with,” she said. “They will be able to come to the Data Hub with the data, and we’ll help them produce their stories.”

The LNDH started as a pilot project in January and is gearing up operations again in the fall.
