Whistle-blowing site WikiLeaks on Thursday released the Syria Files, a database of more than 2.4 million emails to and from Syrian political figures, ministries and associated companies, dating from August 2006 to March 2012.
The 2,434,899 emails were gathered from 680 Syria-related entities and domains, project analyst Sarah Harrison said during a news conference in London. The correspondence includes information from the Syrian Ministries of Presidential Affairs, Foreign Affairs, Finance, Information, Transport and Culture, she said.
"The Syria Files shine a light on the inner workings of the Syrian government and economy, but they also reveal how the West and Western companies say one thing and do another," Harrison said. The database contains information from about 679,000 email addresses that have sent emails to more than 1 million recipients. The number of documents is more than eight times that in the "Cablegate" file of U.S. diplomatic cables leaked via the site, with more than 100 times the volume of data, she said.
Harrison did not say how WikiLeaks obtained the information from so many disparate sources.
To handle the volume of data in the Syria Files, WikiLeaks built a general-purpose, multi-language political data-mining system designed for databases of this size, Harrison said. The emails are in several languages, including about 400,000 in Arabic and 68,000 in Russian, she said. WikiLeaks provides English, German, Spanish and French translation, and is working on additional features to enhance the system, Harrison added.
Because the collection of email is so large, it was not possible to verify every single email at once, said Harrison. But the organization is "statistically confident" that the vast majority of the data are what they purport to be, she added.
The first batch of emails involves an integrator of digital radio systems, Selex SI, and concerns the sale and support of TETRA encrypted digital radios and base stations, which are typically used by police forces. "The database demonstrates that the selling, the assistance and training by Selex continued through to this year," said Harrison. WikiLeaks provided links to the emails in text format on its website, and also a link to a torrent repository of the messages.
WikiLeaks founder Julian Assange didn't attend the news conference because he is trying to avoid extradition from the U.K. to Sweden by seeking political asylum at Ecuador's embassy in London. However, Harrison read a statement from Assange in which he called the material embarrassing for both Syria and its opponents and expressed the hope that conflicts such as the one in Syria can be resolved through understanding.
Assange was arrested in London on Dec. 7, 2010 and placed under house arrest because Sweden had issued a European Arrest Warrant seeking his extradition for questioning about alleged sexual offenses. The U.K. Supreme Court ruled in May that Assange could be extradited.