icews
Some text fields include quotes, e.g. ""Fight"" instead of "Fight"
Some text field values are wrapped in literal double quotes that are stored as part of the value, e.g.:
> query_icews("select * from events where event_id = 25326166;")
event_id event_date source_name source_sectors
1 25326166 20170101 Women (Turkey) "Social,General Population / Civilian / Social"
source_country event_text cameo_code
1 Turkey "Conduct suicide, car, or other non-military bombing" <NA>
intensity target_name target_sectors target_country story_id sentence_number
1 -10 Turkey NULL Turkey 43113964 6
publisher city district province country latitude longitude year
1 Associated Press Newswires Ankara NULL Ankara Turkey 39.9199 32.8543 2017
yearmonth source_file
1 201701 Events.2017.20201006.tab
The "source_sectors" and "event_text" values include the surrounding quotes, and they shouldn't. This is the proper format:
> query_icews("select * from events limit 1;")
event_id event_date source_name
1 926685 19950101 Extremist (Russia)
source_sectors source_country event_text
1 Radicals / Extremists / Fundamentalists,Dissident Russian Federation Praise or endorse
cameo_code intensity target_name target_sectors
1 051 3.4 Boris Yeltsin Elite,Executive,Executive Office,Government
target_country story_id sentence_number publisher city district province
1 Russian Federation 28235806 5 The Toronto Star Moscow <NA> Moskva
country latitude longitude year yearmonth source_file
1 Russian Federation 55.7522 37.6156 1995 199501 events.1995.20150313082510.tab
Is this in the raw data files or a package error?
Some of these are from Events.2017, and I manually checked that these quotes are indeed present in the tab-delimited raw data files.
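For anyone who wants to repeat that check, here is a minimal sketch in base R. It uses a made-up two-row file rather than the real Events.2017 file, and it makes no claim about how the icews package actually reads the raw files. It only shows that quotes present in a tab-delimited file stay in the value when the reader has quote handling turned off (`quote = ""`).

```r
# Toy stand-in for a raw ICEWS events file; the two rows are made up
# and the file name is a temporary one, not a real ICEWS file.
tmp <- tempfile(fileext = ".tab")
writeLines(c(
  "event_id\tevent_text",
  "1\t\"Conduct suicide, car, or other non-military bombing\"",
  "2\tPraise or endorse"
), tmp)

# quote = "" treats double quotes as ordinary characters, so they stay in the value
raw <- read.delim(tmp, quote = "", stringsAsFactors = FALSE)

# quote = "\"" treats them as field delimiters, so they are dropped on read
parsed <- read.delim(tmp, quote = "\"", stringsAsFactors = FALSE)

# Count values that start with a literal double quote under each setting
n_raw <- sum(startsWith(raw$event_text, "\""))
n_parsed <- sum(startsWith(parsed$event_text, "\""))
```

With this toy file, `n_raw` counts the one quoted row and `n_parsed` counts none, which matches the pattern described above: the quotes are in the file, and whether they end up in the database depends on how the file is parsed.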

What files are affected?
Check a couple of the fields to see what source file(s) these are coming from:
"event_text"
> query_icews("select distinct(source_file), count(*) as N from events where event_text like '\"%' group by source_file;")
source_file N
1 Events.2017.20201006.tab 59000
"source_sectors"
> query_icews("select distinct(source_file), count(*) as N from events where source_sectors like '\"%' group by source_file;")
source_file N
1 Events.2017.20201006.tab 512730
"target_sectors"
> query_icews("select distinct(source_file), count(*) as N from events where target_sectors like '\"%' group by source_file;")
source_file N
1 Events.2017.20201006.tab 425417
Of course: every affected row comes from the same file, "Events.2017....tab".
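As a possible stopgap on the user side, the outer quotes can be stripped after querying. This is only a sketch: `strip_outer_quotes` is a hypothetical helper, not part of the icews package, and it assumes the problem is limited to one pair of double quotes wrapping the whole value.

```r
# Hypothetical helper: remove one pair of wrapping double quotes, but only
# when the value both starts and ends with a quote. Other values and NAs
# pass through unchanged.
strip_outer_quotes <- function(x) {
  sub('^"(.*)"$', "\\1", x)
}

# Example values modelled on the output shown earlier in this issue
x <- c(
  "\"Social,General Population / Civilian / Social\"",
  "Praise or endorse",
  NA
)
cleaned <- strip_outer_quotes(x)
```

It could then be applied to the affected columns of a query result, e.g. "event_text", "source_sectors" and "target_sectors". Requiring a quote at both ends is a deliberate choice, so that a value which merely begins or ends with a quote is left alone.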