Mythbusting: are downloads from 'AppleCoreMedia' mostly from Apple Podcasts?
· Updated May 7, 2021 · By James Cridland · 4 minutes to read
Some podcast hosts think that traffic from an AppleCoreMedia user-agent should be lumped into the total numbers for Apple Podcasts. Is that true?
Apple Podcasts, Pocket Casts and Castro are all podcast apps on iOS. When one of them downloads a show, your server sees the app’s own ‘user-agent’: something like
Podcasts/1530.3 CFNetwork/1220.1 Darwin/20.3.0 for Apple Podcasts, or
Castro 2021.2/1301 for Castro. Pocket Casts sends its own identifying string in the same way.
None of these three apps has to download a show before playing it. You can “stream” a podcast on them instead (see the note on this term at the end of this article). The app starts downloading the file and begins playing it as soon as it can, while the rest of the file is still downloading. When you stream a podcast file, none of these apps shows its proper user-agent. Instead, they all look similar to this:
AppleCoreMedia/1.0.0.18D52 (iPhone; U; CPU OS 14_4 like Mac OS X; en_us)
This is the user-agent of Apple’s AVKit library, which each of these apps uses to enable the magic of “streaming”: the capability to play a file while you’re still downloading it. Apple does not give an official way for AVKit’s user-agent to be changed. So they all just say AppleCoreMedia, and that’s all your podcast host can see.
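To make the problem concrete, here is a minimal sketch of what a host can and cannot tell from the user-agent alone. The function name and the returned labels are illustrative assumptions, not any host's real code; the sample strings follow the examples in this article.

```python
def classify_user_agent(user_agent: str) -> str:
    """Label an audio request using only its user-agent string."""
    # Streamed playback reports the media library's user-agent,
    # so the app that started the request is hidden.
    if "AppleCoreMedia" in user_agent:
        return "applecoremedia (app unknown)"
    # A full download carries the app's own string.
    if user_agent.startswith("Podcasts/"):
        return "apple-podcasts"
    if user_agent.startswith("Castro "):
        return "castro"
    return "other"


print(classify_user_agent("Podcasts/1530.3 CFNetwork/1220.1 Darwin/20.3.0"))
print(classify_user_agent(
    "AppleCoreMedia/1.0.0.18D52 (iPhone; U; CPU OS 14_4 like Mac OS X; en_us)"
))
```

Whichever of the three apps streams the file, the second call gives the same answer, which is why the user-agent on its own is not enough.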
We host the Podnews podcast ourselves.
We examine every user-agent requesting our RSS feed, and generate the feed dynamically for each one. Using this open list of RSS user agents, we add a query string to each audio URL in the feed containing a ‘slug’: a code for the user agent that requested the feed.
As an example, if our RSS feed is requested by a user-agent of TuneInRssParser/1.0, then we mark all the audio URLs in that feed with, say, _from=com.tunein. This gives us a high degree of certainty that an audio request containing _from=com.tunein is from a TuneIn client, and allows us to compare that information with the separate user agent given when downloading the file.
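The tagging step might be sketched like this. It is a sketch under stated assumptions rather than the code Podnews runs: only the TuneInRssParser to com.tunein pair comes from this article, while the function names, the "unknown" fallback slug and the example.com URL are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Feed-fetcher user-agent prefix -> slug. Only the TuneIn pair is from the
# article; a real table would be built from a list of RSS user agents.
FEED_AGENT_SLUGS = {"TuneInRssParser/": "com.tunein"}


def slug_for(feed_user_agent: str) -> str:
    """Return the slug for whoever requested the RSS feed."""
    for prefix, slug in FEED_AGENT_SLUGS.items():
        if feed_user_agent.startswith(prefix):
            return slug
    return "unknown"  # hypothetical fallback slug


def tag_url(url: str, slug: str) -> str:
    """Set _from=<slug> on a URL, replacing any existing _from value."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "_from"]
    query.append(("_from", slug))
    return urlunsplit(parts._replace(query=urlencode(query)))


def personalise_feed(rss_xml: str, feed_user_agent: str) -> str:
    """Rewrite every enclosure URL in the feed for this feed requester."""
    root = ET.fromstring(rss_xml)
    slug = slug_for(feed_user_agent)
    for enclosure in root.iter("enclosure"):
        url = enclosure.get("url")
        if url:
            enclosure.set("url", tag_url(url, slug))
    # Real feeds use XML namespaces (for example itunes:), which would need
    # ET.register_namespace to keep their prefixes; omitted here for brevity.
    return ET.tostring(root, encoding="unicode")


FEED = (
    '<rss version="2.0"><channel><item>'
    '<enclosure url="https://example.com/audio/podcast-episode.mp3" '
    'type="audio/mpeg" length="1"/>'
    "</item></channel></rss>"
)
print(personalise_feed(FEED, "TuneInRssParser/1.0"))
```

Because the slug travels in the audio URL, it survives even when the later audio request arrives with an AppleCoreMedia user-agent.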
We then looked at the requests for audio that came with a user agent containing AppleCoreMedia. We used the following query, which only lists downloads of more than 60 seconds of audio (roughly 750,000 bytes of data).
SELECT COUNT(*) AS totalrequests, querystring
FROM cloudfront_logs
WHERE uri LIKE '/audio/pod%'
  AND useragent LIKE '%AppleCoreMedia%'
  AND bytes > 750000
  AND method = 'GET'
  AND SUBSTR(uri, -3, 1) = 'm'
GROUP BY querystring
ORDER BY totalrequests DESC
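For readers who would rather work on log rows in a script, the same filter can be sketched in Python. This assumes each row has already been parsed into a dictionary using the query's field names, with bytes as an integer; the sample rows are made up for illustration.

```python
from collections import Counter


def count_applecoremedia(rows):
    """Count qualifying AppleCoreMedia audio downloads per query string."""
    counts = Counter()
    for row in rows:
        uri = row["uri"]
        if (
            uri.startswith("/audio/pod")
            and "AppleCoreMedia" in row["useragent"]
            and row["bytes"] > 750000       # more than roughly 60 seconds of audio
            and row["method"] == "GET"
            and uri[-3:-2] == "m"           # third-from-last character, as in SUBSTR(uri, -3, 1)
        ):
            counts[row["querystring"]] += 1
    return counts.most_common()             # highest count first


# Hypothetical sample rows.
rows = [
    {"uri": "/audio/podcast-example.mp3", "useragent": "AppleCoreMedia/1.0",
     "bytes": 900000, "method": "GET", "querystring": "_from=com.tunein"},
    {"uri": "/audio/podcast-example.mp3", "useragent": "AppleCoreMedia/1.0",
     "bytes": 1200, "method": "GET", "querystring": "_from=com.tunein"},
    {"uri": "/audio/podcast-example.mp3", "useragent": "Castro 2021.2/1301",
     "bytes": 900000, "method": "GET", "querystring": "_from=com.tunein"},
]
print(count_applecoremedia(rows))
```

Only the first sample row passes every condition: the second is too small to count as a download and the third has the app's own user-agent.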
A similar query gave a total of 205,714 podcast downloads in that time.
AppleCoreMedia downloads numbered a total of 19,254, or 9.3% of all downloads. (CSV, 1.8MB).
Of those downloads, the query above tells us that 6,352 (32%) were from an Apple Podcasts client. An additional 371 were from the Apple iTunes Store, which may be used within Apple iTunes on Windows PCs.
In total, we counted 50 separately identifiable clients using AppleCoreMedia as a user agent, including Pocket Casts, the Google Assistant iOS app, and Overcast. (PDF).
The most popular client using AppleCoreMedia is a special feed that we give Apple’s Siri News service (“Hey, Siri, play the latest news from Podnews”), which plays a slightly different audio file. This traffic would not appear for a normal podcast that is not listed in Apple’s News Briefing service.
So, ignoring the Apple News Briefing traffic, 11,513 downloads we received were using AppleCoreMedia, of which 58% were identifiably from Apple Podcasts or iTunes.
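The arithmetic behind these figures can be checked with a few lines. The counts are the ones quoted above; the Siri figure is derived by subtraction rather than stated in the article, and the variable names are illustrative. Recomputed percentages can differ from the quoted ones in the last digit, which suggests the article rounds down.

```python
total_downloads = 205_714   # all podcast downloads in the period
applecoremedia = 19_254     # downloads with an AppleCoreMedia user agent
apple_podcasts = 6_352      # of those, tagged as an Apple Podcasts client
itunes_store = 371          # of those, tagged as the Apple iTunes Store
without_siri = 11_513       # AppleCoreMedia downloads excluding the Siri news feed

# Derived values (not quoted directly in the article).
siri_news = applecoremedia - without_siri
apple_share = (apple_podcasts + itunes_store) / without_siri

print(f"AppleCoreMedia share of all downloads: {applecoremedia / total_downloads:.2%}")
print(f"Siri news feed downloads: {siri_news}")
print(f"Apple Podcasts or iTunes, ignoring Siri: {apple_share:.1%}")
```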
Real, or Myth?
If a podcast provider is unable to correctly attribute AppleCoreMedia traffic, it isn’t correct either to lump it all into Apple Podcasts or to remove it entirely from that service.
For our corrected data…
58% of AppleCoreMedia traffic is verifiably from Apple Podcasts. It is not correct to simply remove these downloads from Apple’s numbers: that would show Apple’s market share as 5.3 percentage points lower than reality.
42% of AppleCoreMedia traffic is, verifiably, not from Apple Podcasts. It isn’t correct to assign these numbers to Apple Podcasts. That would erroneously inflate Apple Podcasts’ market share by 3.9 percentage points.
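One reading of these two figures, which the article does not spell out, is that each is the matching fraction of AppleCoreMedia’s 9.3% share of all downloads. A short sketch under that assumption, with a hypothetical function name:

```python
def attribution_error(applecoremedia_share: float, apple_fraction: float):
    """Return (understated, overstated) market-share error in percentage points.

    understated: Apple's lost share if all AppleCoreMedia traffic is removed.
    overstated:  Apple's extra share if all AppleCoreMedia traffic is assigned to it.
    """
    understated = applecoremedia_share * apple_fraction
    overstated = applecoremedia_share * (1 - apple_fraction)
    return understated, overstated


understated, overstated = attribution_error(9.3, 0.58)
print(f"removing it all understates Apple by about {understated:.2f} points")
print(f"assigning it all overstates Apple by about {overstated:.2f} points")
```

The results land close to the quoted figures, differing only in how the last digit is rounded.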
What your podcast host could do
Our data is not a typical podcast’s data.
However, dynamically rewriting daily RSS feeds cached on user-agent still achieves a cache hit-rate of 84% according to our data (as opposed to 99% when ignoring user-agent). Requests for RSS feeds are typically not subject to high bursts of traffic, unlike audio.
Edge caches for the audio can be set to ignore the _from= query string.
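In practice this is a setting on the CDN rather than code you write, but the effect can be illustrated with a small sketch. The function name and the example.com URLs are hypothetical; the point is that every slug maps to the same cached copy of the audio.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

IGNORED_PARAMS = {"_from"}  # parameters that must not split the cache


def cache_key(url: str) -> str:
    """Build a cache key from the path plus any query parameters that matter."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in IGNORED_PARAMS]
    query = urlencode(kept)
    return parts.path + ("?" + query if query else "")


a = cache_key("https://example.com/audio/episode.mp3?_from=com.tunein")
b = cache_key("https://example.com/audio/episode.mp3?_from=com.apple.podcasts")
print(a, a == b)
```

The slug is still written to the request log for attribution; it simply plays no part in deciding which cached file to serve.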
This shouldn’t add any complexity to podcast hosting; so it should not be beyond any podcast host’s capabilities to correctly attribute AppleCoreMedia to the right app.
Of note: using this technique, our stats for Apr 22 have just 0.7% of downloads from an unknown app. How does your podcast host match up?
- If you’re an app developer, Omny Studio and Spreaker have worked together to produce this code and testing tool, which fixes the AppleCoreMedia user-agent problem once and for all. It’s open source and available for anyone.
- A “stream” is better called a “progressive download”. In this article, we’re defining it as “playing a podcast while it’s still downloading”. In fact, no podcasts are served by “streaming”; but we’re using this phrase to simply show the difference between this and an ordinary download.
James Cridland is the Editor of Podnews, a keynote speaker and consultant. He wrote his first podcast RSS feed in January 2005; and also launched the first live radio streaming app for mobile phones in the same year. He's worked in the audio industry since 1989.