Child Pornography and the Internet: A Data Analysis and Visualization Project

By The SumAll Foundation


The following is a study of the impact of the Internet on the spread of child pornography.

Over several months The SumAll Foundation team analyzed large amounts of data with the intent of understanding child pornography consumption and distribution habits. We then synthesized the data with cultural and academic research.

As a disclaimer, we’d like to state that our analysis is a series of forecasts and predictions based on the available data, not absolute statements of facts.

The SumAll Foundation recently analyzed traffic data related to the spread of child pornography on the web. What we found was interesting, enlightening, and a little scary.

We discovered that the Internet is driving both child pornography production and consumption, and that neither the producers nor the consumers adhere to typical stereotypes.

Broadly, we were able to create a profile of the behavior of a child pornography consumer. Using source data, we were able to determine the times of day, devices, and browsers associated with child pornography consumers.

We used a mixture of data provided by partners with media and expert analysis to create models and forecast a reasonable profile of the problem.

While our study is by no means a condemnation of pornography, we also found indications that patterns of psychological addiction and dopamine release typically associated with pornography consumption appear correlated with the consumption of child pornography.

While it may seem obvious at first blush, the digital age has provided easier and more economical access to pornography. The societal taboo associated with walking into an adult bookstore is less relevant as consumption now occurs behind closed doors at virtually no cost.

Content is shared and socialized via anonymized communities and networks. As the consumption of pornography content expands, the prototypical consumer becomes increasingly diverse, shifting from the stereotype of the ‘dirty old man’ into a more mainstream demographic.

We discovered that the market for and dissemination of child pornography is large and growing, and that this growth is driven by the Internet.

The consumption of child pornography has the potential to perpetually harm victims.

Speaking in general terms, consumers of child pornography can be broken down into three supergroups: pedophiles, the sexual omnivores, and the sexually curious.

Pedophiles represent a small percentage of the child pornography consumption community, and their behavior may not be deterrable. Interestingly, we also discovered that true pedophiles are more likely to be left-handed, indicating a potential genetic link.

Sexual omnivores. This consumer group typically consumes a breadth of pornographic content, including child pornography.

The sexually curious. This group is relatively new to pornography and is still exploring it. It has the potential to develop consumptive habits if exposed to child pornography early.

The SumAll Foundation leveraged the engineering resources of our sister company to draw strong inferences about the digital spread of child pornography content and user behavior patterns.

We’re proud of our work, though we do want to emphasize that our statistics are projections and probabilistic forecasts, not hard and deterministic statements of fact. With a specific emphasis on search-related data, our data scientists made connections and discovered patterns.

Our analytics team dissected large files, then assisted our research team to extrapolate the most interesting data points. The research team then drafted a narrative and created human-readable charts and graphs.

Towards the end of our research we correlated our conclusions with pre-existing reporting and research related to the digital growth of the child pornography market. We found that our data mapped to the contemporary media narrative.

While some of our methodology and data sources must remain confidential, we can share that our methodology was in line with data analysis best practices. Additionally, our sources were diverse, specific, and knowledgeable. We also cross-checked our findings against publicly available tools like Google Analytics and Trends, large data sets on Amazon Web Services, US Census data, and academic research.

Spending several weeks studying the data behind child pornography had an interesting impact on our team. During our research, information designer Saskia Ketz remarked, “[that] the consumer is more or less an average guy who kind of clicks into darker and darker content was pretty shocking.”

While we thankfully avoided the subject matter directly, our team members collectively spent hundreds of hours exposed to keywords and other data and media related to child pornography. It’s nearly impossible to study such content without experiencing side effects.

“The keywords, sites, and file names were pretty creepy, especially when you consider how average a child porn surfer is,” said fellow researcher Xiu-Jing Shi.

We elected to study child pornography because a tremendous amount of relevant data is available, though the impact of the Internet on the growth of child pornography remains an underreported subject. While topics like child pornography are often unpleasant to examine, The SumAll Foundation believes that we can and should have a positive cultural impact by illuminating and interpreting data around relevant cultural topics.

The essence of The SumAll Foundation is to use data analysis for social good.

Our process involves working with interesting partners to apply our sister company’s formidable engineering resources to the analysis of large data sets. We assist partners by providing much-needed digital resources, analytical tools, and human expertise.

We also hope to help inform the general public about the broad impact of big data on culture, art, and policy.

No project is sans human bias, yet we strive for editorial excellence, and we hope our work reflects our values; we rely on solid data to forecast reliable conclusions.

While we take the discretion and privacy of our sources very seriously, transparency is key to the success of our organization. If you have any questions about our process or people, please feel free to contact us any time.



Project Info:

  • Data Analysis: The SumAll Foundation Team
  • Information Design: Saskia Ketz
  • Researcher: Xiu-Jing Shi
  • Written By: Dan Patterson