Where does the name AGOORA come from?¶
The agora was a central public place in ancient Greek cities where people met for politics, trade, and artistic and spiritual activities. Data and humans need an equivalent place to meet and exchange.
What is SDM?¶
Spoud, the company behind AGOORA, has spent the last 4 years working on event-driven data architectures. Various products and product names have come out of these experiences. Nowadays we simply call the domain we work in SPOUD Data Management (SDM) or Spoud Data Market (SDM).
What is Apache Kafka?¶
Kafka is a widely used event streaming platform. This documentation does not introduce it, but there are plenty of resources on the internet, for example https://kafka.apache.org/intro. You can also check out the on-demand lessons from Confluent: https://cloud.contentraven.com/confluent/self-userpackage?ctid=NzA%3D
Some further reading about Kafka and the Confluent Platform:
Is there support for other event streaming platforms than Apache Kafka?¶
Not yet out of the box. But the meta model is generic and supports other platforms. You can write your own agent to integrate whatever fits into the model. Contact us if you want to discuss further integrations.
How can a cloud SaaS like AGOORA observe my Apache Kafka cluster?¶
To collect observations and statistics, we need components that run next to your Kafka cluster. These components are called agents. Agents get minimal access to read metadata or data on your cluster and report only the essential summaries to the cloud system.
What does the AGOORA SaaS store from my data store (excluding profiles)?¶
The AGOORA (SDM) SaaS is fed with descriptive state-data from the transport/state agents.
This data consists of:
- Name of data store (e.g. Kafka topic name)
- Metadata, e.g. schemas from the Schema Registry
- Metrics (high level, like number of events per second)
- Identification of the underlying transport (e.g. Kafka broker address)
- Key / Value metadata depending on configuration
Users can add further information to the state, such as ownership, descriptions, and links. It is up to the user what to add.
The transport/state agent does not send any payload, so no payload is stored.
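To make the list above concrete, here is a minimal sketch of what such a state record could look like. The class name `DataStoreState`, its field names, and the sample values are hypothetical illustrations and are not taken from the AGOORA API.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, Optional


@dataclass
class DataStoreState:
    """Hypothetical shape of a state report; all names are illustrative only."""

    name: str                       # name of the data store, e.g. a Kafka topic name
    transport: str                  # underlying transport, e.g. a Kafka broker address
    schema: Optional[str] = None    # metadata, e.g. a schema from the Schema Registry
    events_per_second: float = 0.0  # high-level metric
    properties: Dict[str, str] = field(default_factory=dict)  # key/value metadata
    # Note: there is deliberately no field for message payloads.


# Example values are made up for this sketch.
state = DataStoreState(
    name="orders",
    transport="broker-1.example.internal:9092",
    events_per_second=12.5,
    properties={"retention.ms": "604800000"},
)
print(asdict(state))
```

The point of the sketch is only that everything in the record is descriptive: names, metadata, and aggregated metrics, with no message content.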
What does the AGOORA (SDM) SaaS store in profiles from my data store?¶
Profiling is handled separately from the state agent and is optional. The profiler service is responsible for transforming the raw data samples from the agent into an HTML profile which can be presented to the user. The current implementation includes statistics about the fields as well as an inferred schema and a handful of real samples.
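As a rough illustration of what "statistics about the fields" and "an inferred schema" can mean, the following sketch derives both from a few flat sample records using only the Python standard library. It is a simplified stand-in and not the profiler service's actual implementation; the function name `profile_samples` and the sample records are assumptions made for this example.

```python
from collections import defaultdict
from typing import Any, Dict, List


def profile_samples(samples: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Collect observed type names, null counts and missing counts per field."""
    types = defaultdict(set)    # field -> set of observed Python type names
    present = defaultdict(int)  # field -> number of records containing the field
    nulls = defaultdict(int)    # field -> number of records where the value is None
    for record in samples:
        for key, value in record.items():
            present[key] += 1
            if value is None:
                nulls[key] += 1
            else:
                types[key].add(type(value).__name__)
    return {
        key: {
            "types": sorted(types[key]),
            "nulls": nulls[key],
            "missing": len(samples) - present[key],
        }
        for key in present
    }


# Made-up sample records for this sketch.
samples = [
    {"id": 1, "amount": 9.5, "note": None},
    {"id": 2, "amount": 3},
]
result = profile_samples(samples)
print(result)
```

Here the set of observed types per field plays the role of a very crude inferred schema, and the null and missing counts are examples of per-field statistics.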
Where do you store data?¶
We use Amazon Web Services (AWS) to run the SaaS.
Can I turn off the data samples?¶
Yes, of course. Data samples are optional. There are different options to control the reach of your data:
- The profiler components are optional; if you have a reason not to look into the data, you don't need to run them
- The profiler uses a dedicated user, which can be granted permissions on certain topics or denied access to sensitive ones
- Samples can be disabled in the profiler report; that way they won't show up in the AGOORA user interface
How are agents communicating to the SaaS?¶
Agents use gRPC calls and streams over HTTP/2. SSL-intercepting proxies often cause problems with these connections. You may need to add an exception on the proxy so that the connections pass through.
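If you suspect a proxy on the path, a small diagnostic along the following lines may help. It is a sketch using only the Python standard library: it opens a TLS connection that offers HTTP/2 via ALPN, as gRPC clients do, and reports the negotiated protocol and the certificate issuer. The function names are hypothetical, and the host you pass to `probe` is a placeholder that you need to replace with the endpoint your agents are configured to use.

```python
import socket
import ssl
from typing import Optional


def make_h2_context() -> ssl.SSLContext:
    """TLS client context that offers HTTP/2 ("h2") via ALPN."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2"])
    return ctx


def probe(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect to host:port and return the negotiated ALPN protocol and cert issuer."""
    ctx = make_h2_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            cert = tls.getpeercert()
            # The issuer is a tuple of RDNs, each a tuple of (key, value) pairs.
            issuer = dict(rdn[0] for rdn in cert.get("issuer", ()))
            return {
                "alpn": tls.selected_alpn_protocol(),
                "issuer": issuer.get("organizationName"),
            }


def describe(alpn: Optional[str], issuer: Optional[str]) -> str:
    """Turn a probe result into a short human-readable hint."""
    if alpn != "h2":
        return "HTTP/2 was not negotiated; gRPC is unlikely to work on this path"
    return f"HTTP/2 negotiated; certificate issued by {issuer}"
```

For example, `describe(**probe("your-endpoint.example.com"))` prints a hint for a placeholder host. An issuer that belongs to your own organisation rather than a public certificate authority, or an `ssl.SSLCertVerificationError` raised by `probe`, can indicate that a proxy is intercepting the connection.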
Do you support other ways to deploy than Docker?¶
Currently, running in Docker is the only supported way to deploy. Contact us if this is an issue for you.
Can I run AGOORA on-premise?¶
Yes. We currently prefer the cloud deployment because it makes deployments simpler for us to manage. However, we designed the system to use components that run on Docker or on an orchestrated platform like Kubernetes, so in principle this is possible. Let us know if you need an on-premise deployment and we will work out a solution which fits your needs.
What's the difference between kafka-profiler and profiler-service?¶
The purpose of the kafka-profiler is to grab data samples at regular intervals and send them to the profiler-service. The profiler-service runs the Python pandas profiler to create an HTML report and infer a schema from the data sample. The report and schema are sent back as a response to the kafka-profiler. The kafka-profiler handles the extraction of data samples from Kafka and the upload of the reports to the cloud server, where we store them as a data profile for your topic. The profiler-service is technology agnostic and can be used with other data systems.
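The division of work described above can be sketched as a small loop in which each step is a pluggable function. This is only an illustration of the flow, not the actual code of either component: `run_profiling_cycle` and the `fake_*` stand-ins are hypothetical names, and the stand-ins replace the real Kafka access, profiling, and upload.

```python
from typing import Callable, Dict, List, Tuple

Sample = List[dict]
ProfileResult = Tuple[str, dict]  # (HTML report, inferred schema)


def run_profiling_cycle(
    take_samples: Callable[[str], Sample],
    profile: Callable[[Sample], ProfileResult],
    upload: Callable[[str, str, dict], None],
    topics: List[str],
) -> None:
    """Run one sampling interval over the given topics."""
    for topic in topics:
        samples = take_samples(topic)      # kafka-profiler: extract samples from Kafka
        report, schema = profile(samples)  # profiler-service: build report, infer schema
        upload(topic, report, schema)      # kafka-profiler: upload result to the cloud


# Stand-ins so the sketch runs without Kafka or a profiler service.
def fake_take_samples(topic: str) -> Sample:
    return [{"topic": topic, "value": 1}]


def fake_profile(samples: Sample) -> ProfileResult:
    fields = sorted({key for record in samples for key in record})
    return "<html>%d samples</html>" % len(samples), {"fields": fields}


uploaded: Dict[str, ProfileResult] = {}


def fake_upload(topic: str, report: str, schema: dict) -> None:
    uploaded[topic] = (report, schema)


run_profiling_cycle(fake_take_samples, fake_profile, fake_upload, ["orders"])
print(uploaded)
```

Because `profile` only sees a list of records, the same step can serve samples from other data systems, which mirrors the technology-agnostic role of the profiler-service.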
What permissions do the agents need to work with my Kafka?¶
See the documentation of the components: Kafka Agent