This information aids in understanding the strengths and weaknesses of current automated extraction methods.
💥
The datasets have considerable re-use potential, owing to the gold-standard nature of the emission metrics and the wealth of accompanying information.
💥
- I expected it would be a very simple annotation task to copy GHG emission values from a sustainability report into a table. It was not, as the high level of disagreement between non-expert and expert coders shows.
Again, these expert teams disagreed on about half of the 40%. Agreement on which values to extract was reached only during an expert discussion.
4. represents a total value, not subcategories.
Two human non-expert annotators searched for all GHG values that meet these conditions.
Despite a training session, these non-experts agreed on only 60% of all reports.
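For readers who want to see how an agreement figure like this can be computed, here is a minimal Python sketch of simple percent agreement between two coders. It is an illustration only, not the procedure used for the dataset: the function name `percent_agreement`, the `tolerance` parameter, and the example values are all hypothetical.

```python
from typing import Optional, Sequence


def percent_agreement(
    coder_a: Sequence[Optional[float]],
    coder_b: Sequence[Optional[float]],
    tolerance: float = 0.0,
) -> float:
    """Share of reports for which two coders extracted the same value.

    None means the coder found no qualifying value in that report.
    Two None entries count as agreement (an assumption of this sketch).
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must annotate the same reports")
    if not coder_a:
        raise ValueError("at least one report is required")

    matches = 0
    for a, b in zip(coder_a, coder_b):
        if a is None or b is None:
            # Agreement only if both coders found nothing.
            matches += int(a is None and b is None)
        elif abs(a - b) <= tolerance:
            matches += 1
    return matches / len(coder_a)


# Hypothetical emission values for five reports (not taken from the dataset).
annotator_1 = [1200.0, 530.5, None, 88.0, 4100.0]
annotator_2 = [1200.0, 512.0, None, 90.0, 4100.0]

agreement = percent_agreement(annotator_1, annotator_2)
print(f"Agreement: {agreement:.0%}")
```

Percent agreement is the simplest possible measure and does not correct for chance; a chance-corrected statistic such as Cohen's kappa may be more appropriate depending on the analysis.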
1. cover emissions for the entire company,
2. are reported according to the operational boundaries of the scopes (as defined by the Greenhouse Gas Protocol),
To obtain the GHG emission metrics, we extracted them from PDF files with an LLM, GPT-4. This was only to simplify data extraction; human annotators double-checked the values.
We present a gold standard dataset containing emission metrics extracted from 139 sustainability reports collected from company websites.
Extracting GHG indicators from these reports by hand is a laborious task. Could one automate this process? How well do ML and AI models perform?
Large companies in the EU are required by law to report their greenhouse gas emissions in their sustainability reports. How can researchers use this data?
rdcu.be/eCEyr
Join our #GESISsummerschool course to master techniques for analyzing both digital behavioral and traditional survey data, including web scraping, machine learning, and more — all in R!
Book Now: t1p.de/GSS25-C5
The Tell-All Book That Meta Doesn’t Want You to Read www.nytimes.com/2025/03/17/o...
Do not listen to the platitudes of this administration when they say they care about drug overdose, addiction, mental health, or suicide.
Funding cuts, political interference, and ideological mandates by Donald Trump threaten research freedom (#Forschungsfreiheit) in the US. What is happening, and how researchers are responding.
On the blog: www.jmwiarda.de/https-www.jm...