TikTok is a short-form video app that has exploded in popularity over the past few years. Originally launched as Douyin in China in 2016, it was relaunched as TikTok the following year for markets outside of China. As of 2022, TikTok has over 1 billion monthly active users globally (https://appinventiv.com/blog/top-tiktok-statistics/). This massive user base generates enormous amounts of data through uploaded videos, user engagement, and collected user information.
The app allows users to create and share short videos on any topic, originally 15 to 60 seconds long and now up to 10 minutes, often set to music and with creative editing effects. It uses an algorithm that learns user preferences to recommend an endless stream of personalized content. The addictive allure of this algorithm-driven content has led to TikTok’s meteoric growth.
With so many daily active users constantly uploading new videos and consuming content, the amount of data TikTok holds is staggering. From storage of video content to collection of user data for ad targeting, the scale of TikTok’s data universe helps explain why its parent company ByteDance is one of the most valuable startups in the world.
Massive User Base
TikTok has over 1 billion monthly active users globally. Statista's forecast of 955.3 million users worldwide by 2025 is lower, which suggests it counts users differently, but both figures point to a user base on the order of a billion. This massive user base contributes to the large amount of content being uploaded and engaged with on the platform daily.
TikTok’s growth has been rapid, reaching over 2 billion downloads globally by April 2020 according to Oberlo. The app’s popularity has skyrocketed, especially among younger demographics. In Cloudflare's 2021 traffic rankings, TikTok overtook Google as the most popular domain in the world.
With so many millions of users creating, viewing, and engaging with content, it’s no wonder the amount of data and storage required by TikTok is substantial.
Hours of Video Uploaded
The number of videos uploaded to TikTok is staggering. According to research, about 500 hours of video are uploaded to TikTok every minute. That translates to approximately 30,000 hours of new video every hour and about 720,000 hours every day. To put that in perspective, it would take a single person over 80 years of nonstop viewing to watch everything uploaded to TikTok in a single day.
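The arithmetic behind these figures is easy to check. The short sketch below takes the 500 hours per minute figure quoted above as its only input; the variable names are illustrative.

```python
# Upload rate quoted in the text (hours of video uploaded per minute).
HOURS_UPLOADED_PER_MINUTE = 500

# Scale the per-minute rate up to an hour and then to a full day.
hours_per_hour = HOURS_UPLOADED_PER_MINUTE * 60
hours_per_day = hours_per_hour * 24

# Years needed to watch one day's uploads, viewing 24 hours a day.
years_to_watch = hours_per_day / (24 * 365)

print(hours_per_hour)            # 30000
print(hours_per_day)             # 720000
print(round(years_to_watch, 1))  # 82.2
```

The 80-year figure assumes viewing around the clock; at eight hours of viewing per day it would take three times as long.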
The massive amount of user generated video content is likely one of the main reasons why TikTok’s data and storage requirements are so extensive. Storing and processing that much new video content on a daily basis necessitates huge investments in data infrastructure.
Video Storage Requirements
The high resolution and length of TikTok videos require significant storage. Videos on TikTok can be up to 10 minutes long with a maximum file size of 500MB (source), which leaves room for high resolution, high quality footage. At maximum length and quality, a single 10 minute TikTok video would require 500MB of storage. With over 1 billion monthly active users uploading millions of videos each day, this adds up to massive storage needs.
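To get a feel for these limits, the sketch below works out the average bitrate implied by a 500MB, 10 minute file and an upper bound on raw daily storage. The daily upload count is a hypothetical placeholder, not a TikTok figure, and megabytes and terabytes are treated as decimal units.

```python
MAX_FILE_SIZE_MB = 500   # maximum upload size quoted in the text
MAX_LENGTH_MINUTES = 10  # maximum video length quoted in the text

# Assumption for illustration only: the text says "millions" without a number.
HYPOTHETICAL_VIDEOS_PER_DAY = 1_000_000


def implied_bitrate_mbps(size_mb: float, minutes: float) -> float:
    """Average bitrate in megabits per second for a file of the given size."""
    return size_mb * 8 / (minutes * 60)


def daily_storage_tb(videos_per_day: int, avg_size_mb: float) -> float:
    """Raw storage per day in decimal terabytes (1 TB = 1,000,000 MB)."""
    return videos_per_day * avg_size_mb / 1_000_000


bitrate = implied_bitrate_mbps(MAX_FILE_SIZE_MB, MAX_LENGTH_MINUTES)
upper_bound_tb = daily_storage_tb(HYPOTHETICAL_VIDEOS_PER_DAY, MAX_FILE_SIZE_MB)

print(f"Implied bitrate: {bitrate:.2f} Mbit/s")
print(f"Upper bound for one day: {upper_bound_tb:.0f} TB")
```

This is a ceiling rather than an estimate: most videos are far shorter than 10 minutes, so real averages would be much lower than 500MB per upload.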
TikTok videos are recorded at a minimum resolution of 1080×1920 pixels (source). Higher resolution videos at 4K are also supported. Storing high resolution videos requires much more storage space compared to lower resolution videos. Additionally, TikTok stores multiple versions of each video to optimize playback across devices and connection speeds, further multiplying storage needs.
In summary, the lengthy and high resolution videos uploaded to TikTok on a daily basis require the company to have expansive cloud storage and infrastructure to support the app.
User Data Collection
TikTok collects an enormous amount of data on its users. In her podcast on TikTok’s data collection practices, Shelly Banjo asks, “What user data does TikTok collect, and can that data be used against users?” (Source). TikTok gathers information on users’ locations, messages, interests, relationships, biometric data, and content viewing histories. The app monitors user activity and interactions to figure out each user’s preferences and interests. TikTok then uses this personal data for targeted advertising, content recommendations, and other purposes.
Ad Targeting Data
TikTok collects a significant amount of user data to enable precise ad targeting. According to its privacy policy, TikTok may share user data with advertisers and advertising partners to show users targeted ads based on their interests, demographics, behavior on and off TikTok, location, and other factors [1]. This includes data points like age, gender, interests, purchase history, and device identifiers.
For example, if a user frequently engages with cooking content, TikTok can infer they may be interested in cooking products and serve them relevant ads. Advertisers can also upload their own custom audience data to target specific groups of users, like existing customers. The granularity of TikTok’s ad targeting system requires storing immense amounts of data on each of its over 1 billion monthly active users [2].
Moderation Requirements
With over 1 billion monthly active users, TikTok generates an enormous amount of content that requires moderation[1]. In April 2023, TikTok CEO Shou Zi Chew revealed that the company employs “tens of thousands” of content moderators led by a team in Ireland to keep the platform safe[2]. Moderators review videos flagged by users or identified by automated systems as potentially violating community guidelines. Given the volume of new videos constantly being uploaded, TikTok needs a huge workforce dedicated to content moderation. One report indicated that moderators each review as many as 1,000 videos per day, working grueling schedules for minimal pay[3]. Moderation at this scale adds to the company’s data storage needs, because flags, review decisions, and the outputs of automated systems all have to be recorded.
Recommendation Algorithm Data
TikTok’s revolutionary recommendation algorithm is one of the key reasons for its massive popularity. The algorithm analyzes user behavior to determine preferences and recommend highly personalized, engaging content (Source). To function effectively, the algorithm requires enormous amounts of data.
When a user engages with a video, the algorithm notes details like watch time, likes, comments, shares, and more. It compiles this data across its more than 1 billion users to understand interests and preferences. The more data it gathers, the better it becomes at recommending content. With over 1 billion monthly active users now on TikTok, the algorithm has amassed vast datasets for optimization (Source).
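As a rough illustration of how engagement signals could be turned into an interest profile, here is a minimal sketch. It is not TikTok's algorithm, which is not public: the `Engagement` record, the signal weights, and the scoring rule are all hypothetical choices made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical weights: stronger actions count for more than passive viewing.
SIGNAL_WEIGHTS = {"like": 1.0, "comment": 2.0, "share": 3.0}


@dataclass
class Engagement:
    user_id: str
    topic: str
    watch_fraction: float  # share of the video watched, 0.0 to 1.0
    signals: tuple = ()    # e.g. ("like", "share")


def score(event: Engagement) -> float:
    """Combine watch time and explicit actions into a single number."""
    return event.watch_fraction + sum(
        SIGNAL_WEIGHTS.get(s, 0.0) for s in event.signals
    )


def interest_profile(events, user_id):
    """Sum the engagement scores per topic for one user."""
    totals = defaultdict(float)
    for event in events:
        if event.user_id == user_id:
            totals[event.topic] += score(event)
    return dict(totals)


def rank_topics(profile):
    """Topics ordered from strongest to weakest interest."""
    return sorted(profile, key=profile.get, reverse=True)


events = [
    Engagement("u1", "cooking", 0.9, ("like", "share")),
    Engagement("u1", "cooking", 1.0),
    Engagement("u1", "sports", 0.2),
    Engagement("u2", "sports", 1.0, ("like",)),
]
profile = interest_profile(events, "u1")
print(rank_topics(profile))  # ['cooking', 'sports']
```

Even this toy version has to keep one record per view, which hints at why storing engagement data for over a billion users is so demanding.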
All this data on user behavior and interactions allows TikTok to serve hyper-personalized recommendations and keep users deeply engaged. However, storing and processing such massive amounts of data also requires substantial computing resources. The algorithm’s sophistication comes at the cost of large-scale data storage needs.
Multiple Versions of Each Video
TikTok stores multiple compressed versions of each video for optimized delivery across devices and network conditions. According to experts, TikTok transcodes each uploaded video into at least 3 different resolutions: 1080p, 720p and 480p (1). This allows TikTok to deliver the right video quality to match each user’s device capabilities and network bandwidth. The various resolutions also enable smooth playback as users scroll through their feeds. Without transcoding into multiple versions, videos would buffer excessively or play in low quality. As a result, TikTok’s infrastructure has to store several copies of every video, which multiplies its storage needs. The transcoding process is automated through TikTok’s backend video processing workflows.
(1) https://www.reddit.com/r/Tiktokhelp/comments/jkavrq/why_is_my_video_quality_so_bad_once_i_upload_a/
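The effect of keeping several renditions can be sketched in a few lines. The three resolutions come from the text above; the bitrates attached to them are hypothetical values chosen for illustration, and `pick_rendition` is a simplified stand-in for whatever selection logic TikTok actually uses.

```python
# (name, bitrate in Mbit/s), ordered from highest to lowest quality.
# The resolutions are from the text; the bitrates are assumptions.
RENDITIONS = [("1080p", 5.0), ("720p", 2.5), ("480p", 1.0)]


def pick_rendition(bandwidth_mbps: float) -> str:
    """Return the best rendition whose bitrate fits the available bandwidth."""
    for name, bitrate in RENDITIONS:
        if bitrate <= bandwidth_mbps:
            return name
    # Nothing fits, so fall back to the lowest-quality rendition.
    return RENDITIONS[-1][0]


def total_storage_mb(duration_seconds: float) -> float:
    """Storage in MB for all renditions of one video of the given length."""
    total_bitrate = sum(bitrate for _, bitrate in RENDITIONS)
    return total_bitrate * duration_seconds / 8  # megabits to megabytes


print(pick_rendition(3.0))   # 720p
print(pick_rendition(0.5))   # 480p
print(total_storage_mb(30))  # 31.875
```

Under these assumed bitrates, a 30 second video takes about 32MB across all three renditions, compared with under 19MB for the 1080p copy alone.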
Conclusion
TikTok’s massive volume of data can be attributed to several key factors.
First, TikTok has an enormous global userbase, with over 1 billion monthly active users. This results in millions of videos being uploaded daily. Storing these videos requires massive amounts of data storage capacity.
Second, TikTok collects vast amounts of data about its users for ad targeting purposes, including information about their demographics, interests, behaviors, and devices.
Third, moderating the content uploaded requires large teams of human moderators and advanced AI systems, all of which produce and store data.
Additionally, TikTok’s recommendation algorithm analyzes massive datasets to serve users personalized content, and multiple versions of each video are stored at different resolutions to suit different devices and network conditions.
In summary, supporting TikTok’s extensive userbase with customized services results in the need to store and analyze enormous amounts of data.