Codebook

Column definitions for the public release dataset. 66 columns across 13 groups.

Core (9 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| publisher | Publisher | string | Publisher name (e.g., Elsevier, Springer Nature, Wiley) |
| journal | Journal | string | Journal title as listed on the publisher website |
| editor | Name | string | Editor full name as scraped |
| role | Role (raw) | string | Role as listed on publisher page (unstandardized) |
| role_std | Role (standardized) | string | Standardized role: editor_in_chief, associate_editor, section_editor, reviewing_editor, editorial_board_member, deputy_editor, guest_editor, other |
| affiliation | Affiliation | string | Institutional affiliation as listed (cleaned but not canonicalized) |
| orcid | ORCID | string | ORCID iD, if available (16 characters: 15 digits plus a check character that may be X) |
| source_url | Source URL | string | URL of the editorial board page scraped |
| scraped_at | Scraped at | datetime | ISO 8601 timestamp of when this record was scraped |
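
ORCID iDs carry an ISO 7064 MOD 11-2 check character, so values in `orcid` can be sanity-checked before joining against other sources. The following is a minimal sketch, assuming the column holds the bare identifier with or without hyphens (the stored format is not specified in this codebook); the function names are hypothetical.

```python
import re


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Return True if value is a well-formed ORCID iD with a correct check character.

    Assumption: hyphens are optional and surrounding whitespace is ignored.
    """
    compact = value.strip().replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_char(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))  # True
print(is_valid_orcid("0000-0002-1825-0098"))  # False (wrong check character)
```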

Gender (3 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| gender | Gender | string | Inferred gender: male, female, andy (androgynous), unknown |
| gender_raw | Gender (raw) | string | Raw output from gender-guesser before thresholding |
| gender_prob | Gender confidence | float | Confidence score (0-1) for gender inference |

Institution (ROR) (8 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| ror_id | ROR ID | string | Research Organization Registry identifier |
| ror_name | Institution | string | Canonical institution name from ROR |
| ror_country | Country | string | Country of the institution |
| ror_city | City | string | City of the institution from ROR/GeoNames |
| ror_state | State/Province | string | State or province of the institution from ROR/GeoNames |
| org_type | Org type | string | Organization type from ROR (Education, Healthcare, Government, etc.) |
| latitude | Latitude | float | Geographic latitude of the institution |
| longitude | Longitude | float | Geographic longitude of the institution |

Classification (OpenAlex) (4 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| scientific_domain | Domain | string | Broadest classification level (e.g., Life Sciences, Physical Sciences) |
| scientific_field | Field | string | Mid-level field (e.g., Medicine, Engineering, Psychology) |
| scientific_subfield | Subfield | string | Narrow subfield classification |
| scientific_topic | Topic | string | Most granular topic classification |

Journal identifiers (2 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| openalex_source_id | OpenAlex source ID | string | OpenAlex identifier for the journal |
| issn_l | ISSN-L | string | Linking ISSN (groups print and electronic) |

Journal metrics (7 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| oa_2yr_mean_citedness | Mean citedness (2yr) | float | Mean citations per article for articles published in the previous 2 years (OpenAlex 2yr_mean_citedness) |
| oa_journal_h_index | Journal h-index | int | Journal-level h-index from OpenAlex |
| oa_journal_works_count | Journal works count | int | Total number of works published in the journal |
| oa_journal_cited_by_count | Journal citations | int | Total citations received by the journal |
| is_in_doaj | In DOAJ | bool | Listed in the Directory of Open Access Journals |
| is_oa | Open access | bool | Journal is classified as open access by OpenAlex |
| oa_impact_quartile | Impact quartile | string | Q1-Q4 computed locally from OpenAlex 2-year mean citedness. NOT the Clarivate JIF or Scopus CiteScore quartile. |
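
To illustrate how a locally computed quartile can be derived from `oa_2yr_mean_citedness`, here is a rough sketch. It is not the pipeline's actual method: the codebook does not say whether cut points are computed globally or within each field, nor how ties and missing values are handled. The sketch assumes global cut points, the usual convention that Q1 is the most-cited quarter, and a hypothetical function name.

```python
from statistics import quantiles


def impact_quartiles(citedness: dict[str, float]) -> dict[str, str]:
    """Map journal -> "Q1".."Q4" from 2-year mean citedness.

    Assumptions: cut points are computed over all journals at once,
    and Q1 denotes the highest-cited quarter.
    """
    # quantiles(..., n=4) returns the three cut points [q25, q50, q75].
    q25, q50, q75 = quantiles(citedness.values(), n=4)
    labels = {}
    for journal, value in citedness.items():
        if value > q75:
            labels[journal] = "Q1"
        elif value > q50:
            labels[journal] = "Q2"
        elif value > q25:
            labels[journal] = "Q3"
        else:
            labels[journal] = "Q4"
    return labels


# Hypothetical journals "a".."h" with citedness 1.0 through 8.0.
sample = {name: float(i) for i, name in enumerate("abcdefgh", start=1)}
print(impact_quartiles(sample))
```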

Editor bibliometrics (5 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| h_index | h-index | int | Author h-index from OpenAlex |
| total_publications | Publications | int | Total number of works by this author |
| total_citations | Citations | int | Total citations received by this author |
| academic_age | Academic age | int | Years since first publication |
| orcid_source | ORCID source | string | How the ORCID was obtained (scraped, openalex, orcid_api) |

Indexing (6 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| indexed_pubmed | PubMed | bool | Journal is indexed in PubMed/MEDLINE |
| indexed_scopus | Scopus | bool | Journal is indexed in Scopus |
| indexed_wos | Web of Science | bool | Journal is indexed in Web of Science |
| indexed_doaj | DOAJ | bool | Journal is in the Directory of Open Access Journals |
| indexed_cope | COPE | bool | Publisher is a member of COPE (Committee on Publication Ethics) |
| indexing_count | Index count | int | Number of major indexes the journal appears in (0-5) |
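
The 0-5 range of `indexing_count` matches the five boolean flags in this group. A minimal sketch of that relationship follows, assuming the count is simply the number of true flags and that a missing flag counts as not indexed; neither detail is confirmed here, and the helper name is hypothetical.

```python
# The five flag columns listed in the Indexing group.
INDEX_FLAGS = (
    "indexed_pubmed",
    "indexed_scopus",
    "indexed_wos",
    "indexed_doaj",
    "indexed_cope",
)


def indexing_count(record: dict) -> int:
    """Count how many of the five index flags are true for one record.

    Assumption: a missing or null flag is treated as False.
    """
    return sum(bool(record.get(flag)) for flag in INDEX_FLAGS)


row = {
    "indexed_pubmed": True,
    "indexed_scopus": True,
    "indexed_wos": False,
    "indexed_doaj": True,
    "indexed_cope": None,
}
print(indexing_count(row))  # 3
```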

Norwegian Publishing Indicator (3 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| npi_level | NPI level | string | Norwegian Publishing Indicator level (1 or 2). Level 2 = top 20% of journals. |
| npi_discipline | NPI discipline | string | Broad discipline in the Norwegian system |
| npi_field | NPI field | string | Specific field in the Norwegian system |

Funding (6 columns)

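| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| top_funder_1 | Top funder 1 | string | Most common funding source for articles in this journal |
| top_funder_1_count | Funder 1 count | int | Number of funded articles from the most common funder |
| top_funder_2 | Top funder 2 | string | Second most common funder |
| top_funder_2_count | Funder 2 count | int | Number of funded articles from the second most common funder |
| top_funder_3 | Top funder 3 | string | Third most common funder |
| top_funder_3_count | Funder 3 count | int | Number of funded articles from the third most common funder |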

Board diversity (6 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| board_size | Board size | int | Total number of editors on this journal's board |
| board_pct_female | % female | float | Percentage of female editors on this board |
| board_country_count | Countries on board | int | Number of distinct countries represented on the board |
| board_country_hhi | Country HHI | float | Herfindahl-Hirschman Index of country concentration (near 0 = diverse, 1 = all editors from one country) |
| board_institution_count | Institutions on board | int | Number of distinct institutions on the board |
| board_mean_h_index | Mean board h-index | float | Average h-index of board members |
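
The Herfindahl-Hirschman Index behind `board_country_hhi` is the sum of squared country shares on a board. The sketch below shows the standard formula; whether editors with no resolved country are excluded from the denominator is an assumption, and the function name is hypothetical. Note that a board spread evenly over N countries scores 1/N, so the index only approaches 0 for large, diverse boards.

```python
from collections import Counter
from typing import Iterable, Optional


def country_hhi(countries: Iterable[Optional[str]]) -> Optional[float]:
    """Herfindahl-Hirschman Index of country concentration for one board.

    Assumption: editors without a country (None or "") are ignored.
    Returns None when no editor has a known country.
    """
    counts = Counter(c for c in countries if c)
    total = sum(counts.values())
    if total == 0:
        return None
    # Sum of squared shares: 1.0 means every editor is in the same country.
    return sum((n / total) ** 2 for n in counts.values())


print(country_hhi(["US", "US", "GB", "DE"]))  # 0.375
print(country_hhi(["US", "US", "US"]))        # 1.0
```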

Multi-board (3 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| boards_count | Boards served | int | Number of editorial boards this editor serves on |
| publishers_count | Publishers served | int | Number of distinct publishers whose boards this editor serves on |
| is_multi_board | Multi-board | bool | True if editor serves on 2+ boards |
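
The three multi-board columns can be pictured as per-editor aggregates over the Core columns. The following is a sketch only: it assumes editors are matched on the raw `editor` name string, which is almost certainly cruder than whatever disambiguation the dataset actually uses (for example ORCID), and the function name is hypothetical.

```python
from collections import defaultdict


def multi_board_stats(rows: list[dict]) -> dict[str, dict]:
    """Aggregate boards and publishers per editor.

    Assumption: the `editor` string identifies a person uniquely.
    """
    journals = defaultdict(set)
    publishers = defaultdict(set)
    for row in rows:
        journals[row["editor"]].add(row["journal"])
        publishers[row["editor"]].add(row["publisher"])
    return {
        editor: {
            "boards_count": len(journals[editor]),
            "publishers_count": len(publishers[editor]),
            "is_multi_board": len(journals[editor]) >= 2,  # 2+ boards
        }
        for editor in journals
    }


# Hypothetical records for illustration.
rows = [
    {"editor": "A. Editor", "journal": "Journal One", "publisher": "Pub X"},
    {"editor": "A. Editor", "journal": "Journal Two", "publisher": "Pub X"},
    {"editor": "B. Editor", "journal": "Journal One", "publisher": "Pub X"},
]
print(multi_board_stats(rows)["A. Editor"])
```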

Metadata (4 columns)

| Column | Display name | Type | Description |
| --- | --- | --- | --- |
| name_script | Script | string | Detected writing script of the name (Latin, CJK, Cyrillic, etc.) |
| name_script_region | Script region | string | Geographic region associated with the name script |
| data_version | Version | string | Dataset version identifier |
| enriched_at | Enriched at | datetime | ISO 8601 timestamp of enrichment completion |