Data Sources & Methodology

All data used on this site is drawn from public, authoritative sources. Below is a full accounting of every dataset and indicator, and of how the data is processed.

Economic & Demographic

World Bank Open Data

Primary source for country-level economic and demographic indicators, accessed via the World Bank API.

Indicator	Code
GDP (nominal, current USD)	NY.GDP.MKTP.CD
GDP per capita (current USD)	NY.GDP.PCAP.CD
Population	SP.POP.TOTL
CO2 emissions (Mt)	EN.GHG.CO2.MT.CE.AR5
Life expectancy at birth	SP.DYN.LE00.IN
Military expenditure (current USD)	MS.MIL.XPND.CD
Military expenditure (% of GDP)	MS.MIL.XPND.GD.ZS
Armed forces (% of labor force)	MS.MIL.TOTL.TF.ZS

REST Countries

Country metadata (capital, region, subregion, area) from REST Countries API.

UNDP Human Development Index

HDI values from the UNDP Human Development Report 2023-24, covering 200+ countries.

UN General Assembly Voting

Voting Records

Complete General Assembly voting data from the United Nations Digital Library, covering 5,694 resolutions across 202 countries.

Resolution Themes

Resolutions are classified into thematic categories (Nuclear & Disarmament, Palestine & Middle East, Human Rights, etc.) using UN subject headings and document metadata.

Voting Alignment

Pairwise agreement scores between countries are computed from GA voting patterns over the last 10 sessions.

UN Security Council

Membership History

Permanent and non-permanent member terms since 1946 from the UN Security Council.

Vetoes

All vetoes cast by P5 members since 1946 from the Dag Hammarskjöld Library.

Resolution Votes

Curated set of notable Security Council resolutions with individual member votes from the UN Security Council.

International Treaties

Party, signatory, and withdrawal status from the United Nations Treaty Collection for 12 major international treaties:

ICC Rome Statute
NPT (Non-Proliferation)
Paris Agreement
UNCLOS (Law of the Sea)
CTBT (Test Ban)
Cluster Munitions Convention
Arms Trade Treaty
Ottawa Treaty (Mine Ban)
Convention against Torture
Convention on Rights of the Child
Refugee Convention
TPNW (Nuclear Prohibition)

Sanctions

14 active UN sanctions regimes with measures (arms embargo, travel ban, asset freeze) and target countries, from the UN Security Council Sanctions Committees.

Diplomatic Recognition

Recognition stances compiled from official government statements, UN records, and diplomatic sources for 4 disputed/contested entities:

Kosovo
Palestine
Taiwan
Western Sahara (SADR)

Military & Defense

Military Expenditure

SIPRI Military Expenditure Database via World Bank — spending in current USD, as % of GDP, and armed forces as % of labor force.

Military Capabilities

Global Firepower 2025 — power index rankings, personnel strength, aircraft, armor, naval assets, and defense budgets for 145 countries.

Conflict Events

ACLED (Armed Conflict Location & Event Data) — aggregated conflict event counts and fatalities by type (battles, explosions, violence against civilians, protests, riots) for 2023–2025.

GDELT Events & Media Tone

GDELT Event Database

The GDELT Project monitors news media worldwide and codes events using the CAMEO coding system (Conflict and Mediation Event Observations). Events are classified into 20 root codes ranging from cooperative (codes 01-10: statements, appeals, cooperation, aid) to conflictual (codes 11-20: disapproval, threats, coercion, assault).

GDELT DOC API

Media tone data is sourced from the GDELT DOC 2.0 API, which provides timeline-based aggregate sentiment analysis from global news coverage over the past 12 months.

Goldstein Scale

Each CAMEO event type has an associated Goldstein Scale score ranging from -10 (most conflictual) to +10 (most cooperative). Average Goldstein scores per country reflect the overall cooperative vs. conflictual balance of events involving that country.

Data Period & Processing

Event data is sourced from GDELT v1 daily export CSVs. Each daily file is stream-parsed to extract actor country codes, event types, Goldstein scores, and tone values. Per-country and per-bilateral-pair aggregates are computed for event counts, cooperative/conflictual breakdowns, and average tone.

UN General Debate Speeches

GA High-Level General Debate

Full-text speeches from the UN General Assembly General Debate (High-Level Week), covering sessions 75–79 (2020–2024). PDF statements are downloaded and converted to text using pdf-parse.

Keyword Extraction

Keywords are extracted from each speech using frequency analysis of unigrams and bigrams, with common English stopwords and UN procedural language filtered out. The top 15 keywords per speech surface the most prominent topics discussed by each country's delegation.

Country Group Definitions

121 group definitions from the open-source worldcountrygroups package, drawn from official organization sources, UN, and World Bank classifications.

The package is available on PyPI and GitHub.

Methodology

This section documents how raw data is collected, processed, aggregated, and presented throughout the site.

Data Collection & Freshness

Country-level indicators are fetched from the World Bank API for the date range 2018–2024. For each indicator and each country, only the most recent year with available data is retained. This means different countries may report data from different years depending on their reporting cadence to the World Bank.
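
The most-recent-year rule can be illustrated with a minimal sketch. This is not the site's actual code: the function name latest_value and the input shape (a mapping from year to value, with None for missing years) are assumptions made for the example.

```python
from typing import Optional


def latest_value(series: dict[int, Optional[float]],
                 start: int = 2018, end: int = 2024) -> Optional[tuple[int, float]]:
    """Return (year, value) for the most recent year in [start, end] that has data."""
    for year in range(end, start - 1, -1):
        value = series.get(year)
        if value is not None:
            return year, value
    return None  # no data anywhere in the window


# Two hypothetical countries that report on different cadences.
country_a = {2022: 1.5, 2023: 1.7, 2024: None}
country_b = {2021: 0.9, 2022: None}
```

Here country_a would resolve to its 2023 value and country_b to its 2021 value, which is why figures shown side by side may refer to different years.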

World Bank data is fetched in batches of 30 countries per API call. If a batch fails (e.g., due to an unrecognized code), each country in that batch is retried individually to maximise coverage.
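
A rough sketch of the batch-then-retry strategy follows. The fetch callable is a stand-in for the real API request, and both fetch_with_fallback and fake_fetch are hypothetical names used only for illustration.

```python
from typing import Callable


def fetch_with_fallback(codes: list[str],
                        fetch: Callable[[list[str]], dict[str, float]],
                        batch_size: int = 30) -> dict[str, float]:
    """Fetch in batches; if a batch raises, retry its countries one at a time."""
    results: dict[str, float] = {}
    for i in range(0, len(codes), batch_size):
        batch = codes[i:i + batch_size]
        try:
            results.update(fetch(batch))
        except Exception:
            # One bad code fails the whole batch, so isolate it by retrying singly.
            for code in batch:
                try:
                    results.update(fetch([code]))
                except Exception:
                    continue  # drop only the code that cannot be fetched
    return results


def fake_fetch(batch: list[str]) -> dict[str, float]:
    """Stand-in for an API call: rejects any batch containing 'XXX'."""
    if "XXX" in batch:
        raise ValueError("unrecognized code")
    return {code: 1.0 for code in batch}
```

With this shape, a single unrecognized code costs one failed batch plus up to 30 individual requests, and the valid countries in that batch are still retrieved.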

HDI values are bundled from the UNDP Human Development Report 2023–24 (reference year 2022) and updated when a new report is published.

Geographic metadata (capital, region, subregion, area) comes from the REST Countries API. Country names are sourced from REST Countries when available, with a fallback to the World Country Groups registry.

All data writes use an atomic write pattern (write to a temporary file, then rename) to prevent partial or corrupted data from being served.
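
The write-then-rename pattern can be sketched as below. The function name atomic_write_json and the choice of JSON as the output format are assumptions; the site's own storage format is not stated here.

```python
import json
import os
import tempfile


def atomic_write_json(path: str, data: object) -> None:
    """Write JSON to a temp file in the target directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes reach disk before renaming
        os.replace(tmp_path, path)     # readers see either the old file or the new one
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Creating the temporary file in the same directory as the target keeps both on one filesystem, which is what allows the rename to be atomic.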

Group-Level Aggregation

When statistics are displayed for a country group (e.g., G7, NATO), values are computed as the sum of all member countries that have data for that indicator. This applies to GDP, population, CO2 emissions, military expenditure, military personnel, equipment counts, and defense budgets.

Each group stat displays a coverage figure (e.g., “28 of 30 countries”) so users can assess completeness. Countries with missing data for a given indicator are excluded from the sum rather than treated as zero.

Per-capita and percentage indicators (GDP per capita, military % of GDP, HDI) are not summed across groups. They are shown only at the individual country level.
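
As an illustration of summing with a coverage figure, here is a minimal sketch. The function name aggregate_group, the country codes, and the sample figures are all hypothetical.

```python
from typing import Optional


def aggregate_group(values: dict[str, Optional[float]],
                    members: list[str]) -> tuple[float, int, int]:
    """Sum an indicator over members that have data; return (total, covered, size)."""
    present = [values[m] for m in members if values.get(m) is not None]
    return sum(present), len(present), len(members)


# Hypothetical population figures (millions); "CCC" has no data.
population = {"AAA": 10.0, "BBB": 5.5, "CCC": None}
total, covered, size = aggregate_group(population, ["AAA", "BBB", "CCC"])
label = f"{covered} of {size} countries"
```

The missing member is left out of the sum instead of contributing zero, and the label makes that gap visible to the reader.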

Country Rankings within Groups

On each country page, the rankings table shows how that country compares within each of its groups. For each metric (GDP, population, CO2), the table shows the country's absolute value, its rank within the group (1 = highest), and its percentage share of the group total.
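
The rank and share columns could be computed along these lines. This is a sketch under two assumptions not stated above: the group total is taken over members that have data, and ties are broken by input order. The name rank_and_share and the sample values are hypothetical.

```python
def rank_and_share(values: dict[str, float], country: str) -> tuple[int, float]:
    """Return (rank, percent share) of `country` among group members with data."""
    ordered = sorted(values, key=values.get, reverse=True)
    rank = ordered.index(country) + 1  # 1 = highest value
    share = 100.0 * values[country] / sum(values.values())
    return rank, share


# Hypothetical GDP figures for a three-member group.
gdp = {"AAA": 50.0, "BBB": 30.0, "CCC": 20.0}
```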

UN Voting Alignment Scores

Pairwise alignment scores measure how similarly two countries vote in the UN General Assembly. The calculation works as follows:

Only votes from the last 10 GA sessions are considered.
For each resolution, the vote of Country A is compared to the vote of Country B. Possible votes are Yes (Y), No (N), and Abstain (A).
Non-voting / absent (X) records are excluded from the comparison entirely — they are neither agreements nor disagreements.
An exact match (Y-Y, N-N, or A-A) counts as agreement. Any mismatch (Y-N, Y-A, N-A) counts as disagreement.
The alignment score is: agreement_count / total_compared_votes, producing a value between 0 and 1.
A minimum of 10 commonly voted resolutions is required for a pair to be included. This filters out pairs with insufficient overlap.

The “Most Aligned” list shows the 15 countries with the highest scores, and the “Least Aligned” list shows the 15 with the lowest.
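
The scoring steps above can be sketched as follows. The function name alignment and the input shape (a mapping from resolution ID to a one-letter vote) are assumptions; restricting the input to the last 10 sessions is assumed to happen beforehand.

```python
from typing import Optional

VALID = {"Y", "N", "A"}  # absent / non-voting (X) is deliberately excluded


def alignment(votes_a: dict[str, str], votes_b: dict[str, str],
              min_common: int = 10) -> Optional[float]:
    """Share of commonly voted resolutions on which both countries voted identically."""
    compared = agreed = 0
    for resolution, vote_a in votes_a.items():
        vote_b = votes_b.get(resolution)
        if vote_a not in VALID or vote_b not in VALID:
            continue  # neither an agreement nor a disagreement
        compared += 1
        agreed += vote_a == vote_b
    if compared < min_common:
        return None  # not enough overlap to report a score
    return agreed / compared


# Hypothetical records: 12 resolutions, country B differs on three and is absent once.
a = {f"R{i}": "Y" for i in range(12)}
b = dict(a, R0="A", R1="A", R2="N", R3="X")
```

In this example 11 resolutions are compared and 8 match, so the score is 8/11. Note that a Yes against an Abstain counts as a full disagreement, just like a Yes against a No.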

Group Voting Cohesion

When viewing a group's voting record, cohesion is calculated per resolution. For each resolution, the majority vote among group members is determined, and cohesion equals the fraction of the group that voted with the majority. For example, if 10 members vote Yes and 4 vote No, cohesion for that resolution is 10/14 = 71.4%.
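
A small sketch of the per-resolution cohesion calculation is shown below. It assumes the input contains only members with a recorded Yes, No, or Abstain vote; how ties for the majority position are resolved is not specified above, and here the first position encountered wins. The function name cohesion is hypothetical.

```python
from collections import Counter


def cohesion(votes: list[str]) -> float:
    """Fraction of voting members that sided with the group's most common vote."""
    if not votes:
        raise ValueError("no votes recorded for this resolution")
    counts = Counter(votes)
    majority_size = counts.most_common(1)[0][1]
    return majority_size / len(votes)


# The worked example from the text: 10 Yes votes and 4 No votes.
example = ["Y"] * 10 + ["N"] * 4
```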

Resolution Theme Classification

Each UN General Assembly resolution is classified into thematic categories using keyword pattern matching against the resolution title and UN subject headings. The classification rules:

Each resolution is assigned a maximum of 2 themes.
Themes are checked in a fixed priority order (Nuclear & Disarmament first, Budget & Administration last).
A special “Country-Specific Situations” theme is applied only when the title matches patterns like “situation in [country]” or “question of [territory]” and fewer than 2 themes have already been assigned.
Resolutions that match no pattern are classified as “Other”.

The 16 thematic categories are:

Nuclear & Disarmament
Palestine & Middle East
Human Rights
Colonialism & Self-Determination
Economic Development
LDCs, LLDCs & SIDS
Environment & Sustainability
Refugees & Migration
Apartheid & South Africa
Peacekeeping & Security
International Law
Outer Space
Health & Social
Information & Cyber
Budget & Administration
Country-Specific Situations
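
The classification rules can be sketched as a priority-ordered pattern scan. The regular expressions below are hypothetical placeholders covering only three of the themes; the site's actual keyword lists are not reproduced here, and the function name classify is an assumption.

```python
import re

# Hypothetical patterns, listed in priority order (illustration only).
THEME_PATTERNS = [
    ("Nuclear & Disarmament", re.compile(r"nuclear|disarmament", re.I)),
    ("Human Rights", re.compile(r"human rights", re.I)),
    ("Budget & Administration", re.compile(r"budget|financing", re.I)),
]
COUNTRY_SPECIFIC = re.compile(r"situation in |question of ", re.I)
MAX_THEMES = 2


def classify(title: str, subjects: str = "") -> list[str]:
    """Assign up to two themes from the title and subject headings."""
    text = f"{title} {subjects}"
    matched = [name for name, pattern in THEME_PATTERNS if pattern.search(text)]
    themes = matched[:MAX_THEMES]  # earlier (higher-priority) themes win
    if len(themes) < MAX_THEMES and COUNTRY_SPECIFIC.search(title):
        themes.append("Country-Specific Situations")
    return themes or ["Other"]
```

Because the list is scanned in a fixed order and truncated at two, a resolution that matches three patterns keeps only the two highest-priority themes.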

Military Capabilities & Defense Spending

Military data comes from two independent sources with different scope:

SIPRI / World Bank (160 countries): Official government-reported military expenditure in current USD, as a percentage of GDP, and armed forces as a share of the total labor force. This data follows the same most-recent-year selection as other World Bank indicators.
Global Firepower 2025 (145 countries): A composite military power index plus detailed hardware and personnel inventories (active/reserve personnel, aircraft, tanks, naval vessels, defense budgets).

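At the group level, absolute military metrics (expenditure, personnel, equipment counts, and defense budgets) are summed across members, while percentage indicators are shown only per country. The “Top Members by Power Index” list ranks members by the Global Firepower composite score (lower score = stronger military).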

Note that the SIPRI defense budget and Global Firepower defense budget may differ for the same country, as they use different estimation methodologies and reference years.

Conflict Events & Intensity

Conflict data from ACLED covers the period 2023–2025. Events are pre-aggregated by country and broken down by type:

Battles
Explosions & Remote Violence
Violence Against Civilians
Protests
Riots
Strategic Developments

Each country is assigned a conflict intensity level (high, medium, low, or none) based on total event counts and fatality figures. This classification is pre-computed in the source data rather than calculated dynamically.

At the group level, events and fatalities are summed, and the “Members with Active Conflict” list ranks affected members by total fatalities (descending).

Year-over-year trend charts on country pages show the fatality trajectory across the covered period. Bar heights are scaled relative to the year with the highest fatality count, so that year's bar is the tallest.
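
The bar scaling amounts to dividing each year's count by the peak year's count. A minimal sketch, with a hypothetical function name and sample figures, and with the 0 to 100 height range chosen arbitrarily:

```python
def bar_heights(fatalities: dict[int, int], max_height: float = 100.0) -> dict[int, float]:
    """Scale each year's fatality count relative to the peak year."""
    peak = max(fatalities.values(), default=0)
    if peak == 0:
        return {year: 0.0 for year in fatalities}  # nothing to draw
    return {year: max_height * count / peak for year, count in fatalities.items()}


# Hypothetical yearly fatality counts.
sample = {2023: 50, 2024: 200, 2025: 100}
```

One consequence of this scaling is that bar heights are comparable across years within a country but not across countries.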

GDELT Events & Media Tone

GDELT data is processed in two phases:

Event data: Daily GDELT v1 export CSVs are stream-parsed. Each event record provides actor country codes (CAMEO format), event root codes, Goldstein Scale scores, mention counts, and average tone. Events are classified as cooperative (CAMEO codes 01-10), conflictual (11-20), or neutral.
Media tone: For the top countries by event volume, the GDELT DOC API provides 12-month timeline tone data, which captures aggregate media sentiment weighted by article volume.

Cooperation ratio is computed as cooperative_events / (cooperative_events + conflictual_events). Neutral events are excluded from this ratio. A ratio above 0.5 indicates more cooperative than conflictual events.
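
A sketch of the ratio, using the code ranges given above. Treating any root code outside 01-20 (or any unparseable code) as neutral is an assumption, since the criteria for neutral events are not spelled out; the function names are hypothetical.

```python
from typing import Optional


def classify_root(root_code: str) -> str:
    """Map a CAMEO root code to cooperative, conflictual, or neutral."""
    try:
        code = int(root_code)
    except ValueError:
        return "neutral"  # assumption: unparseable codes are neutral
    if 1 <= code <= 10:
        return "cooperative"
    if 11 <= code <= 20:
        return "conflictual"
    return "neutral"


def cooperation_ratio(root_codes: list[str]) -> Optional[float]:
    """cooperative / (cooperative + conflictual); neutral events are ignored."""
    buckets = [classify_root(code) for code in root_codes]
    cooperative = buckets.count("cooperative")
    conflictual = buckets.count("conflictual")
    if cooperative + conflictual == 0:
        return None  # ratio is undefined without any classified events
    return cooperative / (cooperative + conflictual)
```

For example, the codes 01, 04, 10, 19 and one unparseable entry give three cooperative events and one conflictual event, for a ratio of 0.75.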

Bilateral relationship data pairs countries by co-occurrence in GDELT events and computes per-pair cooperation ratios and tone averages. Only the top 10 partners by event volume are retained per country.

At the group level, tone is weighted by article volume, and cooperation ratio is computed from aggregate cooperative and conflictual event counts. Intra-group pairs show bilateral relationships where both actors are group members.
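
Volume-weighted tone can be sketched as a weighted mean. The function name weighted_tone and the input shape (one tone value and one article count per member) are assumptions for illustration.

```python
def weighted_tone(members: list[tuple[float, int]]) -> float:
    """Average tone across (tone, article_count) pairs, weighted by article count."""
    total_articles = sum(count for _, count in members)
    if total_articles == 0:
        return 0.0  # assumption: report neutral tone when there is no coverage
    return sum(tone * count for tone, count in members) / total_articles


# Hypothetical group: one heavily covered member with negative tone, one lightly covered.
group = [(-2.0, 300), (1.0, 100)]
```

Here the result is -1.25, compared with -0.5 for an unweighted mean, which shows how heavily covered members dominate the group figure.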

UN General Debate Speeches

Speech PDFs are downloaded from gadebate.un.org for each country and session. Text is extracted using pdf-parse (based on Mozilla PDF.js). PDFs that yield fewer than 100 characters of text (likely scanned images) are skipped.

Keyword extraction uses frequency-based analysis:

Text is lowercased and split into words (minimum 4 characters).
Common English stopwords and UN procedural language (~200 terms) are removed.
Unigram and bigram frequencies are counted. Bigrams appearing 3+ times are included.
The top 15 terms by frequency are retained as keywords.
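
The extraction steps above can be sketched as follows. The stopword set is a tiny stand-in for the site's list of roughly 200 terms, and forming bigrams after stopword removal is an assumption, as the order of those two steps is not stated. The function name extract_keywords is hypothetical.

```python
import re
from collections import Counter

# Tiny stand-in list (illustration only).
STOPWORDS = {"this", "that", "with", "have", "assembly", "president"}


def extract_keywords(text: str, top_n: int = 15, min_bigram_count: int = 3) -> list[str]:
    """Return the most frequent unigrams and sufficiently repeated bigrams."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if len(w) >= 4 and w not in STOPWORDS]
    counts = Counter(words)
    bigrams = Counter(f"{first} {second}" for first, second in zip(words, words[1:]))
    counts.update({bg: n for bg, n in bigrams.items() if n >= min_bigram_count})
    return [term for term, _ in counts.most_common(top_n)]


speech = ("Climate change threatens islands. Climate change demands action. "
          "Climate change is real.")
keywords = extract_keywords(speech)
```

In this sample the bigram "climate change" appears three times and so is kept alongside its component unigrams, while bigrams seen only once are dropped.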

Country-to-URL mapping uses a slug-based lookup built from the country registry, with fallback to parsing the ISO2 code from the PDF URL embedded in each country's page HTML.

Diplomatic Recognition

Recognition data tracks 4 disputed or contested entities: Kosovo, Palestine, Taiwan, and Western Sahara (SADR). For each entity, the dataset records:

The list of recognizing countries (by ISO3 code)
Countries that have withdrawn recognition
The entity's UN membership status (member, observer-state, non-member)
The date independence was declared

A country's stance toward each entity is determined by its presence in the recognizer or withdrawal lists. Countries not in either list are classified as “Does Not Recognize.”
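
The stance lookup reduces to two set-membership checks. In this sketch only the “Does Not Recognize” label comes from the text above; the other two labels, the function name, and the rule that the withdrawal list takes precedence when a country appears in both lists are assumptions.

```python
def stance(country: str, recognizers: set[str], withdrawn: set[str]) -> str:
    """Classify a country's stance toward one entity from the two ISO3 lists."""
    if country in withdrawn:
        return "Withdrawn Recognition"  # hypothetical label
    if country in recognizers:
        return "Recognizes"             # hypothetical label
    return "Does Not Recognize"


# Hypothetical ISO3-style codes for illustration.
recognizers = {"AAA", "BBB"}
withdrawn = {"CCC"}
```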

Sanctions Regimes

Sanctions data covers 14 active UN Security Council sanctions regimes. Each regime record includes the establishing UNSC resolution, the date it was established, the types of measures imposed (arms embargo, travel ban, asset freeze, etc.), and the targeted countries.

A country is shown as “under sanctions” if it appears in the target list of one or more regimes. Measures are displayed separately for each regime and are not aggregated across regimes.

Treaty Ratification

Treaty status is tracked for 12 major international treaties from the UN Treaty Collection. Each country's relationship to a treaty falls into one of four categories:

Party — has ratified or acceded to the treaty
Signatory Only — has signed but not ratified
Withdrawn — was a party but has since withdrawn
Not Party — has neither signed nor ratified

Limitations & Caveats

Data lag: World Bank indicators can lag 1–2 years behind the current date. The “most recent year” shown may differ across countries and across indicators.
Coverage gaps: Not all countries report all indicators. Small states, territories, and countries in conflict often have incomplete data. Coverage percentages are shown where applicable.
Theme classification: Resolution themes are assigned by keyword matching, not by expert review. Some resolutions may be misclassified or assigned to overly broad categories.
Alignment scores: Voting alignment measures co-voting behaviour, not diplomatic alignment. Two countries may vote similarly for different reasons, or diverge on votes while being close allies.
Conflict intensity: ACLED intensity levels are pre-computed aggregates. They do not capture the full context of each conflict (duration, scale of displacement, strategic significance).
Military data discrepancies: SIPRI and Global Firepower use different methodologies. Defense budgets may differ between sources for the same country and year.
Recognition data: Diplomatic recognition is fluid. States may have ambiguous or evolving positions that are not fully captured by a binary recognizes/does-not-recognize classification.
GDELT data: GDELT captures media-reported events, not all real-world events. Countries with more English-language media coverage are overrepresented. Tone scores reflect media framing, not objective reality. The cooperation ratio is a rough proxy and should not be interpreted as a measure of actual diplomatic relationships.

Last Updated

Data last refreshed: March 3, 2026

worldbank: March 2, 2026
restcountries: March 2, 2026
undp: March 2, 2026