Environmental Data Journalism Academy for the Philippines

Behind-the-Scenes of an Earth Journalism Network Fellow’s First Data Story

By Boonyanin (Genie) Pakvisal, 2021 EJN Data Journalism Fellow

In “Locked in – Why Thailand buys electricity from Laos,” EJN fellow Boonyanin (Genie) Pakvisal uses data to challenge the prevailing narrative that Thailand should simply stop importing hydropower from Laos to solve its problem of excess energy supply. Instead, she probed why hydropower imports from Laos continued to grow, who profits from the excess imports, which legal and regulatory systems enabled greenwashing and profiteering, and who ultimately pays the price. Fellows applied rigorous research methods to ensure a reproducible, transparent investigation through an inquiry-based approach using a tool called a Masterfile. In this first-person account, Genie walks aspiring data journalists through the Masterfile process she used for her investigation and how it transformed the way she approaches journalism.

Background 

In the first step of the Masterfile process, journalists choose an issue and then inform their investigation by exploring a) global data journalism stories on similar topics to identify innovative data-driven approaches, b) local coverage of the issue to understand gaps in coverage, c) research into the issue to identify a newsworthy, unexplored angle, and d) data sources that can be used for the investigation.

I knew from the start that I wanted to write a story about hydropower in Laos. It was a big topic in the region. It was a puzzle for me - why did we keep building more dams even though there were plenty of environmental groups that were against them? With this thought in mind, I kept looking for data that could explain this phenomenon. 

For the background, I looked at published environmental stories that used data journalism. They inspired me to think through how data could drive the story but still be written to be easily digestible for the audience. I still think that the essence of good data storytelling is that the numbers blend in with the words - they are interpreted and easy to follow.

My topic of concern was renewable energy in Thailand, specifically hydropower. I started big, then slowly narrowed down. The Stimson Center’s Mekong Infrastructure Tracker was essential to my investigation, but at first I was not sure what compelling story I could uncover. Then I started examining other related data sources and homed in on the puzzling issue of Thailand importing hydropower it did not need.

I looked at many datasets. Some international sources, such as the Stimson Center database, allowed me to do regional comparisons. I also looked at more local datasets such as the EGAT annual reports. The tools that were taught in the programme - for extracting data from PDFs - were surprisingly helpful for me to gather data from a range of formats. Since I wanted to understand the dynamics at both a cross-border and local level, both international and local data sources were essential. 

Hypothesis and Questions 

In the second step of the Masterfile process, journalists formulate a testable hypothesis. A hypothesis is an affirmative statement that can be proven true or false with the data available. It keeps the investigation specific and manageable but, more importantly, it ensures that the journalist exposes the “news” in the data: the important information that hasn’t been reported before. Each hypothesis is tested through a series of questions that use data to measure the extent of the problem, the communities being impacted, the cause of the problem and potential solutions to the problem. At this stage, the journalist also checks that the data they have can be used to answer these questions, and scrapes and cleans the data in preparation for answering them.

I pored over my hypothesis, over and over. Then I found that the best solution was that it had to be one sentence. “One sentence, Genie,” my mentor would say to me. That turned out to be some of the best advice because when I was confused about where my story was going or how a piece of data fit in the bigger picture, I would go back to the sentence I wrote. 

In the end, my hypothesis was simply: “Thailand has excess electricity but it’s still investing in hydropower projects in Laos due to the investment market between the two countries and this investment comes at the cost of the social-economic well-being of people in Laos.”

During this stage of the process, there was a lot of going back and forth, which I think is good, because it helps refine the idea. As I sifted through more data, I found more ideas and connections, which made me revise and refine the hypothesis.

It was then easier to write the corresponding questions to test the hypothesis. There are four categories of questions: problem, cause, impact, and solution. To write these, I would read the hypothesis in the context of the question category and formulate questions for which the answers would prove my hypothesis right or wrong.

The problem I looked at was “Thailand has an excess in electricity but is still investing in hydropower projects in Laos.” Therefore, the questions revolved around Thailand’s electricity reserve, the amount of energy Thailand was buying from Laos and the details of the hydropower projects. 

The cause I was looking at was why Thailand was purchasing electricity from Laos. Here, my hypothesis was that “With the way electricity prices are calculated in Thailand, imported hydropower from Laos keeps the price low for Thai consumers. The imported hydropower comes from the major dams in Laos that Thailand (EGAT) purchases energy from.” To test this, I investigated how much the electricity imported from Laos cost Thailand's energy authority. I also wanted to know where the profits were going, so I asked: What percentage of Laos’ hydropower developments is funded by Thailand? What percentage of Laos’ hydropower developments is funded by China?

The difference between a data story and a traditionally reported story was that the questions I was asking were of the data itself, not necessarily of a person providing their own opinion or interpretation of the situation.

Analysis 

In the analysis stage, the journalist performs and documents the calculations necessary to answer the questions. This process is often called “interviewing your data.” The documentation steps are often called “bulletproofing your story” because they ensure transparency and reproducibility. The journalist also simplifies the answers into digestible sentences for readers.

The analysis was done in spreadsheets. I mention this because the use of spreadsheets was very specific. To answer the questions we set out, we would identify the datasets needed. For each dataset spreadsheet, I created a source tab and a data diary tab. The data diary is a step-by-step account of all the edits and calculations made to the data. The intention behind this is to make the analysis as transparent as possible in case someone wants to double-check the information later on. This is similar to keeping a recording or transcript of interviews with people: it lets you prove the accuracy of your reporting, whether the source is human or data.
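The same habit carries over if the analysis is done in a script rather than a spreadsheet. The following is a minimal Python sketch of a data diary, offered as an illustration only: the story's own analysis was done in spreadsheets, the `log` helper is a hypothetical name, and the three rows are made-up placeholder values, not figures from the investigation.

```python
# Illustrative rows only; these are NOT figures from the story.
rows = [
    {"project": "Project A", "type": "hydro", "capacity_mw": "100"},
    {"project": "Project B", "type": "coal", "capacity_mw": "250"},
    {"project": "Project C", "type": "hydro", "capacity_mw": "50.5"},
]

# The "data diary": one numbered entry per edit or calculation.
diary = []


def log(action, detail):
    """Append a numbered entry describing what was done to the data."""
    diary.append({"step": len(diary) + 1, "action": action, "detail": detail})


# Step 1: clean. Capacities arrive as text and must become numbers.
for row in rows:
    row["capacity_mw"] = float(row["capacity_mw"])
log("convert", "capacity_mw parsed from text to float")

# Step 2: filter. Keep only the hydropower projects.
hydro = [row for row in rows if row["type"] == "hydro"]
log("filter", f"kept type == 'hydro': {len(rows)} rows -> {len(hydro)} rows")

# Step 3: calculate. Add up the remaining capacities.
total = sum(row["capacity_mw"] for row in hydro)
log("sum", f"total hydro capacity_mw = {total}")

# Print the diary so that someone else can retrace every step.
for entry in diary:
    print(f"{entry['step']}. [{entry['action']}] {entry['detail']}")
```

Because every change is logged next to the code that makes it, a reader who questions the final total can follow the entries in order and repeat each step.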

Once the analysis was done, the data findings were also simplified. For example, take the question: “How much hydropower energy does Thailand currently buy from Laos?”

From the data analysis, the answer I got was: “In 2019, Thailand's EGAT bought a total of 3947.6 MW from Laos hydropower projects.”

In that form, the answer might be difficult for the audience to understand. So, the answer would be further simplified to:

“In 2019, Thailand’s EGAT bought close to 4000 MW of hydropower from Laos. That is, around one twelfth of Thailand’s installed capacity. Or in other words, around 9 percent of Thailand’s electricity production.” 

This was carried out repeatedly for all the questions. 
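The simplification step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used in the story: the function name `simplify_share` is hypothetical, and the total of 46,000 MW is a placeholder chosen only to show the arithmetic. The only figure taken from the text is 3947.6 MW.

```python
def simplify_share(value_mw, total_mw):
    """Turn a raw figure and a total into reader-friendly phrasings."""
    share = value_mw / total_mw
    return {
        # Round to the nearest thousand for a "close to ..." phrasing.
        "rounded_mw": int(round(value_mw, -3)),
        # Whole-number percentage of the total.
        "percent": round(share * 100),
        # Nearest "one in N" fraction, e.g. 12 means "around one twelfth".
        "one_in": round(1 / share),
    }


# 3947.6 MW comes from the text; 46_000 is a PLACEHOLDER total, not real data.
result = simplify_share(3947.6, 46_000)

print(f"close to {result['rounded_mw']} MW")
print(f"around {result['percent']} percent of the total")
print(f"around one in {result['one_in']} of the total")
```

Keeping the raw figure, the total and the rounding rules in one place makes it easy to check that the simplified sentence still matches the underlying data.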

By the end of the analysis, the narrative started to take shape. What was missing was the characters and additional details that would add to the data analysis. To fill in the missing pieces, I interviewed experts and other stakeholders on the topic. 

Interviews

During the interview stage of the Masterfile, the journalist, who already understands quite a lot about the topic, brings their findings to people who can either demonstrate the impact of the findings, explain the causes, identify solutions or suggest other data sources. Unlike in a traditional interview, the journalist approaches the interview with very specific questions in which they present the findings of their analysis for comment.

There were four types of interviews that I hoped to carry out: an impact interview, an explanatory interview, an accountability interview and a solution interview.

I found that the explanatory interview was particularly helpful for my story, as I wanted an expert who could explain why the trend that emerged from the data analysis was happening. Once I was able to secure that interview, I felt that the story was much more complete.

After the explanatory interview, the solution interview was also key for my story. As the data projected quite a depressing outlook for the energy sector, I hoped that there could be some voices that introduce potential solutions to the reader. For that, I talked to Courtney Weatherby, a research analyst at the Stimson Center, who was able to suggest policies and reforms necessary to change the system.

Story Structure

During the story structure stage, the journalist examines all the answers to their questions from their data analysis and human interviews and determines whether their original hypothesis is true or false. In most cases, the journalist will find that the hypothesis was partially true, and a compelling, nuanced story emerges. Journalists consider four potential story structures: Inverted Pyramid, Compare and Contrast, Step-by-Step through an Issue, or Explainer. The journalist develops sub-heads based on the chosen structure and organizes their findings under the appropriate sub-head. From there, the narrative takes shape.

I chose an explainer structure because I wanted consumers to understand where their electricity comes from so they are better informed about Thailand’s energy policy. At the end of the day, I think the story showed how consumers should have more say in the source of their electricity. However, under the current system in Thailand, there is very little room for alternatives. This, in itself, is a problem. The problem is magnified when the electricity options that are chosen have an adverse impact on the environment and when people have little understanding of where their energy is coming from and who could be suffering as a result.

[Laos hydropower is a cheap electricity source for Thailand, but the hidden costs impact Thai consumers. Illustration: Jia Dong Lin] 

Visualization

Visualizations are developed to help the audience understand and digest the data-driven conclusions presented in the story. To design an effective data visualization, the journalist writes a headline explaining the message of the visualization, chooses the most effective chart type to convey that message and ensures that in addition to following basic design rules, they also clearly label the elements of the chart and the source of the data. 

The final step of the story was to add visualizations. For the header image, I wanted to include an illustration that would capture the essence of the story. With a data journalism story, I found that a photograph was a bit too simple. An illustration, on the other hand, gives more freedom to include multiple elements and creative expression. 

For the data visualizations, the Thibi team offered support in ensuring the visualization conveyed the editorial message I needed to enhance my story. I included three data visualizations in my story: a map of the Thai-invested hydropower projects in Laos, the evolution of Laos’ GDP dependence on electricity exports over the years and the price of hydropower compared to other sources of electricity in Thailand.

Conclusion

The beauty of a data journalism story is in its ability to present new information. Numbers can reveal hidden relationships that would not have been visible otherwise. This process changed my perspective on how a story can be found. What struck me about it was how similar it is to a scientific experiment. We formulate a hypothesis based on our news nose and find information to test whether our hunch was right.

Tips

  1. Keep searching - Searching for data was sometimes difficult as there were tons of data to sift through.

  2. Elevator pitch check - In the end, a data story should be digestible to the audience. I found that a great way to check this was by seeing if I could explain the story in the time span of an elevator ride. If the problem, cause, impact and solution fit together, then I think that's a well-researched piece of data journalism.

  3. Ask, ask, and ask - The great thing about the workshop was being able to ask for help from the mentors and having weekly check-ups. Not only that, I also found it helpful to check data findings with sources. Often I was straightforward with them. I said that I was working on a data journalism story and asked: “Here are the findings I have so far. How do they sound to you?”