Main Unknowns


Who are the biggest buyers and sellers of data?

See Service Providers for an initial idea. Potential partners also lists some large market players.

Next steps

  • Deeper research on the largest market players.
  • Understand the regulatory risks they face so we can avoid them.

How are these data collected today, and how will that change?


Next steps

  • Survey and mitigate current and future legal boundaries, especially around data protection, set by heterogeneous rule makers, both national (e.g. Germany) and supra-national (e.g. the European Parliament). For an overview, have a look here.

How are these data priced and sold?

See Data Brokers and The Value of Data for some information. Stefan tells us a rough estimate is that each data point purchased from a DMP will add 10% to CPM costs.
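Stefan's estimate translates into a quick back-of-the-envelope calculator. A minimal sketch, assuming the 10% is additive on the base CPM rather than compounding (the quote doesn't specify which):

```python
def enriched_cpm(base_cpm: float, data_points: int) -> float:
    """Rough CPM after DMP enrichment, per Stefan's estimate that each
    purchased data point adds ~10% to CPM costs.

    Assumption: each data point adds 10% of the *base* CPM
    (additive, not compounding).
    """
    return base_cpm * (1 + 0.10 * data_points)

# A $2.00 base CPM enriched with 3 purchased data points:
print(round(enriched_cpm(2.00, 3), 2))
```

Useful mainly as a sanity check when comparing quotes across DMPs; the real pricing dimensions (see next step) will be multi-factor.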

Next steps

  • Pricing dimensions: get a better notion of data pricing across categories, demographics, sources, and reliability

How are these data used?

Next steps

  • Build a visual explanation of how the protocol and its integrators would fit into the current stack
  • Get hands-on experience with DSP, SSP and DMP products

How do buyers ascertain data accuracy?

See related assumption: Higher data accuracy is relevant.
They don't, because they can't. This explains what we've heard repeatedly (e.g. from Aurel): data purchasing is always a gamble. Buyers care only about ROI, not accuracy itself, and will buy a small volume first to test the waters.
Providing a verification layer could be a good USP for brand marketing scenarios; see How relevant is data accuracy? for more information.

How is ownership over these data protected?

It's possible there are no good solutions here due to the ease of copying and modifying digital data. If that's the case, we expect to find several inefficiencies in data markets stemming from overbearing due diligence and verification processes.
If we want to build a large, liquid, permissionless market, we likely need to reassure data producers that we can properly deal with attribution and payment.
See Ocean Protocol for an overview of their approach.

Next steps

  • Identify best current practices to prevent and detect unauthorized data consumption and resale (e.g. do data producers employ steganographic and stochastic watermarking techniques?)
  • Find a solution for our product that, at the very least, meets the current bar (see also Ensuring attribution and payment)
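One common watermarking practice worth evaluating is salting each buyer's dataset with decoy ("canary") records derived from the buyer's identity: if the canaries later surface in a third party's dataset, the leak traces back to that buyer. A minimal sketch, with hypothetical field names:

```python
import hashlib

def _canary_emails(buyer_id: str, n: int) -> set:
    """Deterministic, buyer-specific decoy email addresses."""
    return {
        "user-" + hashlib.sha256(f"{buyer_id}:{i}".encode()).hexdigest()[:12]
        + "@example.com"
        for i in range(n)
    }

def salt_dataset(records: list, buyer_id: str, n_canaries: int = 2) -> list:
    """Return a copy of the dataset with buyer-specific canary records
    appended. (Illustrative only; real salting would blend canaries in
    and make them statistically indistinguishable from real records.)"""
    salted = list(records)
    for email in sorted(_canary_emails(buyer_id, n_canaries)):
        salted.append({"email": email, "segment": "unknown"})
    return salted

def detect_leak(suspect: list, buyer_id: str, n_canaries: int = 2) -> bool:
    """Check whether a suspect dataset contains this buyer's canaries."""
    emails = {r.get("email", "") for r in suspect}
    return bool(_canary_emails(buyer_id, n_canaries) & emails)
```

This only detects resale after the fact; it does not prevent copying, which is why the attribution/payment question above still matters.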

Fractal Protocol

What use cases will we support at launch?

We're almost certainly targeting the online advertising market, projected to grow to $400B in 2021. See Potential business models for some examples.

Next steps

  • Understand which alternative data consumers exist (or could be brought into existence) and their market size (e.g. identity verification, credit scoring, medical research).
  • Establish the KPIs we should measure ourselves by.
  • Provide the best options to capture this value, adhering to ethical standards and predefined principles.

How will we grow the pool?

We are assuming that we can change the market with the right incentives (offering data silos a better deal), and that we can significantly broaden the data market (making new business models possible by unlocking permissionless data liquidity).

Next steps

  • Design a token economy that incentivizes data producers to join our pool, and provide mechanisms to curate, enrich, and validate data (accuracy, yield, etc.)
  • Identify "one-click" solutions for data producers to plug into our pool and start generating cash flows.
  • Identify and make it easy to onboard large data producers.
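To make the first step above concrete: one simple shape for producer incentives is splitting a periodic reward pool proportionally to a contribution score. The sketch below scores producers by records contributed weighted by a validated-accuracy factor; both knobs are hypothetical design parameters, not a spec:

```python
def split_rewards(pool_reward: float, producers: dict) -> dict:
    """Split a reward pool among data producers pro-rata to a toy
    score = records contributed x accuracy weight.

    `producers` maps producer id -> {"records": int, "accuracy": float}.
    All parameters are illustrative design knobs.
    """
    scores = {p: m["records"] * m["accuracy"] for p, m in producers.items()}
    total = sum(scores.values())
    if total == 0:
        return {p: 0.0 for p in producers}
    return {p: pool_reward * s / total for p, s in scores.items()}

# 130 tokens split between a large accurate producer and a smaller one:
print(split_rewards(130.0, {
    "alice": {"records": 100, "accuracy": 0.9},   # score 90
    "bob":   {"records": 50,  "accuracy": 0.8},   # score 40
}))
```

A real design would need Sybil resistance and on-chain accuracy attestation, which this sketch deliberately ignores.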

How will we sell data?

A better framing might be: can we balance data privacy and availability? In other words, how do we maintain a decentralized data pool online?
We imagine two ways that data can be shared in advertising. The first is online, during an ad request: a user visits a website they've chosen to share data with, and the website uses these data to deliver better content (through internal customization) and better ads (by enriching the ad request). This dovetails neatly with the Data Wallet and is what we've been discussing.
However, this limits data availability to transactional online interactions. Should we also, or instead, support an offline market? This would be easy with a centralized pool, but that's not only off-brand, it's also a honeypot and a ticking time bomb. Alternative solutions might resemble Sovrin's Agents, Solid's Pods, or PolyPoly's polyPods, where an always-online software agent handles data requests according to preferences set by the user.
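The online path can be sketched as a consent-gated release: the wallet hands over only the fields the user approved for that site, and the site attaches them to its ad request. All names below are hypothetical, not protocol definitions:

```python
def release_fields(wallet: dict, consent: dict, site: str) -> dict:
    """Return only the wallet fields the user consented to share
    with `site`. `consent` maps site -> set of allowed field names."""
    allowed = consent.get(site, set())
    return {k: v for k, v in wallet.items() if k in allowed}

def enrich_ad_request(request: dict, wallet: dict, consent: dict) -> dict:
    """Attach consented user data to an outgoing ad request."""
    shared = release_fields(wallet, consent, request["site"])
    return {**request, "user_data": shared}

wallet = {"age_band": "25-34", "interests": ["cycling"], "email": "x@y.z"}
consent = {"news.example": {"age_band", "interests"}}

req = enrich_ad_request({"site": "news.example", "slot": "banner"},
                        wallet, consent)
print(req["user_data"])  # email is never released: not in the consent set
```

The offline-market alternatives (Agents/Pods/polyPods) would run essentially this same gate on an always-online host instead of in the browser session.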

Next steps

  • Deeper research into our data storage alternatives
  • Explore pricing mechanisms (Aurel is already looking into some of this).

Who is our current and upcoming competition?

What infrastructure do we use for our Substrate chain?

Once we have a Substrate chain built, we have several options for deployment: running it in isolation with our own validator network, or as a parachain or parathread on Polkadot or Kusama.
Running it in isolation affords us more throughput and control over fees, and sidesteps the potentially prohibitive parachain auction or parathread execution costs. As a parachain or parathread, we sidestep the overhead of recruiting and incentivizing a validator set, access cross-chain communication for easier trustless integration with other chains, and possibly gain legitimacy in the community.
Independent substrate chain
  • Needs own validators
  • Setup costs: recruiting and incentivizing validators
Substrate parachain
  • Once available in Polkadot/Kusama
  • Trustless interoperability with other chains
  • Setup costs: parachain slot auction
Substrate parathread
  • Once available in Polkadot/Kusama
  • Trustless interoperability with other chains
  • Transaction costs: pay-as-you-go execution fees
It's also possible to develop and deploy smart contracts on Substrate chains that support them, e.g. Moonbeam, Edgeware or Plasm. This option is likely to afford us too little control to be attractive, so we're not considering it at the moment.
See Polkadot's deployment paths for more information.

Parachain auctions

Parachain slots aren't sold, but leased for a limited period. These leases are awarded via an auction mechanism.
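Polkadot uses a candle auction format: bidding stays open for a while, and the effective end time is determined retroactively at a random point within the closing window, so last-second sniping doesn't pay. A toy simulation (heavily simplified; the real mechanism uses on-chain randomness and bids per lease period):

```python
import random

def candle_auction_winner(bids, close_window, seed=None):
    """Toy candle auction.

    `bids` is a list of (time, bidder, amount) tuples. The effective
    close is drawn uniformly from close_window = (start, end); the
    highest bid placed at or before that moment wins.
    """
    rng = random.Random(seed)
    close = rng.uniform(*close_window)
    eligible = [b for b in bids if b[0] <= close]
    if not eligible:
        return None
    return max(eligible, key=lambda b: b[2])[1]

# A late higher bid only wins if the retroactive close falls after it:
bids = [(1, "teamA", 100), (5, "teamB", 120)]
print(candle_auction_winner(bids, (0, 10), seed=42))
```

The takeaway for us: budgeting for a slot means being competitive throughout the closing window, not just at the nominal end.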
The auction mechanism was successfully tested in the Rococo testnet by several test parachains, such as Acala's and KILT's. Rococo also saw successful usage of XCMP (trustless cross-chain communication).
The team has been silent on this, but Kusama slot auctions are expected to start soon:
"When we're confident testnet parachains are running smoothly and the code has been fully audited and benchmarked, a vote to enable parachains and slot auctions on Kusama will be submitted via on-chain governance."

PLOs (Parachain Lease Offerings)

Parachain auctions can be financed through native crowdloan functionality, referred to as a Parachain Lease Offering (PLO): supporters lock their tokens toward a team's bid, and the team typically rewards them in its own token. See Crust Network's and Equilibrium's Medium posts about their PLOs.
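The fund flow is worth internalizing before we consider a PLO ourselves: contributors' tokens are never spent, only locked; they return immediately if the auction is lost, and only after the lease expires if it is won. A toy model:

```python
from dataclasses import dataclass

@dataclass
class Contribution:
    contributor: str
    amount: float  # e.g. KSM locked via the crowdloan

def refunds_due(contribs, auction_won: bool, lease_ended: bool) -> dict:
    """Toy model of crowdloan fund flow: if the auction is lost, funds
    return immediately; if won, they stay locked until the lease ends.
    (Team-token rewards to contributors are a separate, off-model promise.)
    """
    if auction_won and not lease_ended:
        return {}  # still locked for the duration of the lease
    return {c.contributor: c.amount for c in contribs}
```

The practical implication: a PLO's "cost" to us is the reward we promise contributors for their lock-up, not the locked principal itself.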

Next steps

  • Monitor upcoming auctions and track volume and going prices