[PROPOSAL #155][PASSED] Spacebox - Tool for Data Indexation and Storage in the Cosmos Hub

Spacebox - Tool for Data Indexation and Storage in the Cosmos Hub

Change log

  • 2023-02-10 Created initial post

IPFS link - Full Proposal IPFS

Introduction:

As the Cosmos ecosystem expands and evolves, efficient and reliable data management within the Cosmos Hub becomes increasingly important. Bro_n_Bro has experience working with indexers in the ecosystem and understands how they are used to build advanced applications. Although existing indexers are popular for monitoring purposes, they lack the flexibility and user experience that developers and users need. As more blockchains emerge, the need for data storage grows. Improving storage and data access at the SDK level would be ideal, but for now a separate indexer remains the best solution for quick access to data.

To address this requirement, we propose the development of a comprehensive set of open-source tools for data indexation and storage, utilizing the cutting-edge technology of ClickHouse as the foundation for the storage facility. This approach will not only provide quick access to large data sets but also ensure a stable architecture that guarantees data consistency and enables a lightweight setup. This will foster the creation of more sophisticated user-experience and analytics applications within the Cosmos Hub and across the entire Cosmos ecosystem.

Scope of Work:

  • Utilizing ClickHouse as the storage facility to provide quick access to large data sets
  • Developing a stable architecture that guarantees data consistency
  • An easily deployable set-up that will enable the development of more user-experience and analytics applications in the Cosmos Hub and the entire Cosmos ecosystem
  • Providing dev documentation, contribution, and set-up guides to assist in the understanding and execution of the project

Description:

The architecture of the proposed tool was designed for scalability and data consistency.

To achieve these goals, crawling and writing to the DB are separated into different microservices.

All services are packaged as Docker containers to simplify deployment. The main repo contains a ready-to-use docker-compose file and deployment documentation.
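The separation of crawling from writing described above can be illustrated with a minimal producer/consumer sketch. This is not the Spacebox code: the function names, the in-memory queue standing in for a message broker, and the dictionary standing in for the database are all assumptions made for illustration.

```python
import queue
import threading


def crawl(heights, out_queue):
    """Hypothetical crawler: fetches blocks and hands them to the queue."""
    for height in heights:
        block = {"height": height, "txs": []}  # stand-in for a real node query
        out_queue.put(block)
    out_queue.put(None)  # sentinel: no more blocks to process


def write(in_queue, storage):
    """Hypothetical writer: drains the queue and persists each block."""
    while True:
        block = in_queue.get()
        if block is None:
            break
        storage[block["height"]] = block  # stand-in for a database insert


def run_pipeline(heights):
    """Run crawler and writer concurrently, connected only by the queue."""
    blocks = queue.Queue()
    storage = {}
    crawler = threading.Thread(target=crawl, args=(heights, blocks))
    writer = threading.Thread(target=write, args=(blocks, storage))
    crawler.start()
    writer.start()
    crawler.join()
    writer.join()
    return storage
```

Because the two sides only share the queue, either one could in principle be scaled or restarted without changing the other, which is the property the proposal attributes to the microservice split.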

All of the above-mentioned services are in a pre-alpha version and should not be considered a final product.

Funding:

Development has already started, but to finalize it we are requesting the support of the Cosmos Hub community. 7,500 ATOMs would be enough to cover the cost of developing and implementing an Indexer that supports the current Cosmos SDK and IBC modules.

The proposed project will take:

  • soft date - 3 months
  • hard date - 6 months

Conclusion:

Today, most Cosmos applications are built either on raw chain data taken directly from an API or on closed-source indexers. A robust open-source solution with easy and fast data access could drive the development of great new applications and support the growth of the whole ecosystem.

Voting:

  • YES - by voting ‘Yes’ on this proposal, you indicate support for funding the creation of the Spacebox services.
  • NO - by voting ‘No’ on this proposal, you do not support this proposal in its current form - please kindly indicate why by leaving comments in the Cosmos Forum.
  • NO WITH VETO - A ‘NoWithVeto’ vote indicates a proposal either (1) is deemed to be spam, i.e., irrelevant to Cosmos Hub, (2) disproportionately infringes on minority interests, or (3) violates or encourages violation of the rules of engagement as currently set out by Cosmos Hub governance. If the number of ‘NoWithVeto’ votes is greater than a third of total votes, the proposal is rejected and the deposits are burned.
  • ABSTAIN - You wish to contribute to quorum but you formally decline to vote either for or against the proposal.
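
The veto rule stated above (rejection when ‘NoWithVeto’ exceeds a third of total votes) can be written as a small check. This is only a sketch of the rule as worded in this post; the function name is hypothetical, and the actual on-chain tally uses governance parameters that are not reproduced here.

```python
from fractions import Fraction


def is_vetoed(yes, no, no_with_veto, abstain):
    """Return True if NoWithVeto is strictly more than one third of all votes."""
    total = yes + no + no_with_veto + abstain
    if total == 0:
        return False  # assumption: with no votes cast there is nothing to veto
    return Fraction(no_with_veto, total) > Fraction(1, 3)
```

Exact fractions are used so that a share of exactly one third is not misclassified by floating-point rounding.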

Sounds cool, but I’m wondering how this directly benefits the Hub.

Depending on the response to:

This seems like it might be more appropriate for an application to the recently passed ATOM accelerator DAO, especially as the team there may be able to make links between this tooling and teams that would be in need of such tooling to deliver a more complete offering that does directly benefit the ATOM economic zone.

It also opens the general question of:

  • Which proposals are more relevant to the scope of ATOM Accelerator DAO than the community pool and how might we navigate that?
    • I’d love to hear the takes of the ATOM accelerator DAO folks. I’ll cc @Youssef & @Better_Future on that one because it will be an interesting discussion going forward and the more thought that goes into it, the better.

As a community-driven validator, we understand the importance of ensuring that funds are allocated efficiently and effectively to projects that will benefit the community. This is why we created a community spend draft proposal for our already ongoing project instead of seeking funding from the DAO, which can be a lengthy process. Moreover, with this proposal we are giving the community the power to decide whether our project should receive funding. By letting the community make this decision, we are promoting transparency and accountability and ensuring that the funds are used in line with the community’s goals and values.


What is the sustainability plan for this product? If indexing changes after the funding cycle, does the tool get updated?

I love to support most of these grants, but the sustainability and maintenance of the product should be considered.

Also, Numia has its tools open source and will be working on more tools for extracting data from nodes. The team is already indexing many chains, and the data is open for anyone to query.
(we built this tooling without funding too)


Hey @Bro_n_Bro it’s always helpful if you update the forum tags to ensure that people participate in the conversation before the proposal goes on-chain. Would you mind updating it now to: [PROPOSAL #155] [ON-CHAIN] ?

Also, there are still some unanswered and important questions in this thread that voters will probably want to consider before voting on this proposal. It may be worthwhile to let people know your thoughts on those.

Thanks, Marko, for your comment. Could you please post links to the Numia open-source tools? We couldn’t find any. Thanks in advance!

As we have already built this tool, we are committed to updating and improving it for as long as we validate the chain. Our repo is also open source, and we are building it as a community tool, so anyone can fork and improve it at any time.

Thanks for your advice, we have already changed the title!

We are also addressing any concerns regarding our proposal; if you have any questions, feel free to ask.

I can definitely see the benefits of such a project.

It’s just tiresome to go through proposal after proposal with the same recurring issue, which is sustainability.

  • What is the plan after the project is ‘done’ (3 to 6 months)? Is the ‘community’ supposed to take care of it? Should people work on the project for free after you received funding to start it?

  • You mentioned ClickHouse. It’s not a free service. Who’s going to pay for it after the 3-6 month period?

All in all it’s a great idea, but I just don’t see how this is not a temporary gadget that will decay pretty quickly after a couple of changes to data sources, formats, or whatever.

Still, thanks for writing the prop; I can now explore what Numia offers.

PS: I am not sure how far you can go with Numia, but after checking the docs, you need a GCP account, which will have a cost after a short free trial.

To mention some features that the Cosmos Hub can benefit from:

  • fast applications that can be deployed on our free open-sourced indexer basis (there are a few open-sourced indexers but they are outdated)
  • booster for analytics (to present more deep and complicated analytics)
  • quick data access from node to get needed information

We are interested in building on and using its services in the future and in providing more complex analytics for community and developer needs. Also, WE are the ones who will maintain the Indexer for as long as we validate the chain. We are building it as a community tool, and our main repo with Spacebox can easily be forked by anyone and used for any needs. Using our Indexer will also help other services provide deeper analytics and respond to requests faster.

With regard to ClickHouse: its code is open source and you can use it at zero cost, so we are not sure what exactly you are referring to here. No one needs to pay for maintenance; this is a one-time request to complete the more complicated part of the Indexer (it is already in the works, and all GitHub links are attached above; just to mention it one more time: README link).

Also, Numia’s code is NOT open source; you have to pay to use its services (correct us if we are wrong). If you have the open-source code of what Numia is doing, please feel free to post it here; we will check it too and see what, if anything, can be used.


Thanks for answering.

I thought ClickHouse was a service that you had to pay for since the first thing you see on their website is Free Trial and Pricing.
But if you can use the source code directly then

I believe you are right numia does not seem to be open source.
Regarding cost you only have to pay per queries with GCP Big Query which would be on the ‘client’ side.

I definitely think what you are asking is fair and you’ll get a yes.

Good luck with the project!


It’s a YES from Citizen Cosmos. Development should be open source in our opinion, and tools such as the one mentioned above will be beneficial for several different entities on the Hub.


Indexing is in a sorry state right now, so more tools around this can only be good at this point.

We are actively looking for one we can use for our chain, and the first one to meet our requirements will be used and contributed back to.

I am not super worried about maintenance and support long-term, because if it is good and gets used by the community, there will be contributions (I believe). If it sucks and nobody uses it… Well, then it doesn’t matter :smiley:

Since I feel like there is a big gap in the tooling here, I will vote Yes for this.


Thanks for your support! Step by step, we will hopefully reach quorum :slight_smile:

And yeah, we agree. There seems to be a lack of understanding of why a reliable open-source Indexer is needed; people are more worried about yet another funding proposal and are not seeing the point of creating additional value for the ecosystem.

FAQ post

Q: Why not use The Graph, as it is moving towards the Cosmos Hub? Why do we need a custom Indexer on the Cosmos Hub?

A: Firstly, The Graph charges fees for indexing and querying data, which can be expensive for projects with large amounts of data. The cost of indexing historical data from block #1 of the Cosmos Hub, for example, could be substantial. In contrast, the Bro_n_Bro Indexer is an open-source solution with no associated fees and no need to sell your personality to test and play with it - a thoroughly decentralized spirit.
Secondly, while The Graph is moving towards supporting the Cosmos Hub (as announced in 2020), it may not provide the same level of flexibility and customization as a custom indexer. Spacebox, for example, is designed explicitly for the Cosmos Hub and provides a comprehensive set of tools for data indexation and storage, with ClickHouse as the storage facility. This approach provides quick access to large data sets, guarantees data consistency, and enables a lightweight setup. In other words, it doesn’t require builders to have HEAVY hardware setups to operate their own instance of Spacebox.
Finally, the Bro_n_Bro Indexer provides complete control over the data and indexing process, making it easier to customize the indexer to specific project requirements. In contrast, while The Graph is a powerful tool that provides a standardized API for querying data, it may provide less control and flexibility than a custom indexer.


Q: Why not modify BDJuno?

A: We have been using BDJuno for more than two years for one of our projects, and honestly, it does not provide the best experience. With all due respect to the Juno and BDJuno devs and supporters, it is not as reliable as it is supposed to be: it misses blocks with almost no way to track down why, it is relatively slow, and it relies on a Postgres DB, which does not handle heavy aggregated queries well once the DB size passes ~150GB. We also attempted to modify BDJuno a while ago; there is still a repo called osjuno to check. Considering all of the above, we decided it would be more efficient to develop a new custom indexer from scratch, tailored to the specific requirements of the use case, as Bro_n_Bro is currently doing with the Spacebox indexer.
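
To make the "missing blocks" problem concrete, here is a minimal sketch of how gaps in a set of indexed block heights could be detected. It is not a feature of BDJuno or Spacebox; the function name and the idea of passing heights as a plain list are assumptions for illustration.

```python
def find_missing_heights(indexed_heights, first, last):
    """Return every height in [first, last] that is absent from the index."""
    seen = set(indexed_heights)
    return [height for height in range(first, last + 1) if height not in seen]


# Heights 3, 6, and 7 were never written by the (hypothetical) indexer.
gaps = find_missing_heights([1, 2, 4, 5, 8], first=1, last=8)
print(gaps)  # [3, 6, 7]
```

A check like this only shows which blocks are missing, not why, which is the harder part of the issue described above.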


Q: What kind of applications can be developed on the custom Bro_n_Bro Indexer?

A: It can enable the development of a wide range of applications that require fast, reliable, and flexible access to the historical data of the Cosmos Hub, for example:

  • Analytics tools: Developers can use the indexer to create analytics tools that help users to analyze and visualize the data from the Cosmos Hub. These tools can be used for market analysis, trading, and other financial applications.

  • DApps: Decentralized applications can be built on top of the custom indexer that interacts with the Cosmos Hub. These DApps can offer new services and functionalities unavailable on the Cosmos Hub.

  • Wallets: Wallet developers can use the custom indexer to create more advanced wallet applications that offer features such as transaction history, account balances, and deeper user portfolio analysis.

  • Governance and voting: An application could use the indexer to track voting patterns and trends and provide real-time insights into the voting behavior of different stakeholders.

  • Data analysis: Researchers and data analysts can use the custom indexer to extract and analyze the data to uncover insights and trends that are not currently available, do deep user behavioral analysis, and understand how different metrics correlate with each other. This is especially important given the upcoming ICS.

  • Identity and reputation: An application could use the indexer to track the on-chain activity of different users and assign reputation scores based on their behavior.

  • Trading and market analysis: A custom indexer can provide quick and easy access to historical trading data, allowing developers to build complex trading and market analysis tools.


Q: Why not apply for the Grant program? Why did you create a Community Pool Spend Proposal?

A: While the Cosmos Grant Program is an excellent initiative for funding and supporting projects in the Cosmos ecosystem, it has yet to be started, while we need support now. As a community-driven validator, we wanted the community to vote on and fund a proposal that they believe will benefit the ecosystem. It is a more democratic and decentralized approach, where the decision-making power is in the hands of the token holders.

In this case, we can see that most voters from the community voted YES (almost 14k) and fewer than 500 voted NO, while most validators voted Abstain (at the time of writing).

As validators, we are responsible for representing the interests of the community and delegators who have entrusted us with their stake. Bro_n_Bro strongly urges you to consider the will of the community and delegators in your decision to vote on this proposal.

This custom indexer will provide a reliable and efficient data management solution for the Cosmos Hub, facilitating the development of new applications and improving the overall user experience for the Cosmos ecosystem. We must support initiatives like this that contribute to the growth and development of our ecosystem.

Thank you for your attention!

P.S. We understand that several community-spend proposals are up for consideration, and prioritizing which proposals to support cannot be easy. However, we urge you to take a closer look at our proposal for the Spacebox Indexer. We genuinely believe that the community and delegators have already shown their support for our proposal.


We recently had a discussion with a group of validators, and we are very grateful for the feedback and support we received. We also see that, due to a lack of detailed info, a lot of validators had to vote Abstain on prop 155 in the Cosmos Hub.

In light of this, we created a workshop video showing Spacebox accessing data in real time, demonstrating the response speed and some other features. The video is uploaded to IPFS; you can check it here.

Thank you for your time, and we look forward to hearing your thoughts on our latest video!

Bro_n_Bro,

We generally strongly support development efforts. The demo you have shown appears to be useful to the general public.

The difficulty in voting yes is that we don’t have enough basis to verify that 7500 ATOMs is a reasonable amount.

What we can see on the surface is an ask for:
7500 ATOM * $12/ATOM = $90k for 3-6 months.

That would mean effort valued at $180k-$360k/year. Is this rate pretty normal for this kind of ask?
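
The annualized figures above follow from simple arithmetic, sketched here. The function name is hypothetical, and the $12/ATOM price is the one quoted in this post, not a live price.

```python
def annualized_cost_usd(atoms, usd_per_atom, months):
    """Scale a one-off request over `months` to a 12-month equivalent in USD."""
    total_usd = atoms * usd_per_atom
    return total_usd * 12 / months


print(annualized_cost_usd(7500, 12, 6))  # 180000.0
print(annualized_cost_usd(7500, 12, 3))  # 360000.0
```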

We don’t want to require it, but would you be willing to provide more details in regards to expected expenses or expected manpower allocation?


Hey Chill,

Thank you for raising valid concerns regarding the cost and manpower allocation of our project.

To address your questions, we understand that the amount we are requesting (7500 ATOMs) may seem like a significant sum. However, we would like to assure you that this amount is based on our estimate of the expenses needed to complete the project in a timely and efficient manner.

Your estimate is not precise: this is a one-time funding request, so the annualized numbers you provided do not reflect reality.

As for manpower allocation, we have a dedicated team of 2 Go developers, 1 DevOps engineer, and 1 Data Analyst/Infrastructure specialist. This team is already working on the project and will continue to do so for the next 3-6 months.

Let’s break down the costs:

2 Go Devs - $10k per month
1 Data Analyst - $3.5k
1 DevOps - $2.5k
Infrastructure - $500

The total is $16.5k per month, which comes to roughly $49.5k-$99k over 3-6 months.

There is no limit on the length of the project; 3-6 months is the time in which we want to deliver the first production release, so the main goal of the funding is to support this period. Bro_n_Bro will maintain the project, as beneficiaries of this product.
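
The totals can be checked with a short calculation. This sketch assumes that every line item is a monthly figure and that the $10k covers both Go developers together, which is the reading that matches the stated $16.5k monthly total; the names in the code are illustrative only.

```python
# Monthly costs in USD, as listed in the breakdown above.
MONTHLY_COSTS_USD = {
    "go_developers_x2": 10_000,  # assumption: combined figure for both devs
    "data_analyst": 3_500,
    "devops": 2_500,
    "infrastructure": 500,
}


def project_cost_usd(months):
    """Total cost in USD for a project running `months` months."""
    return sum(MONTHLY_COSTS_USD.values()) * months


print(project_cost_usd(1))  # 16500
print(project_cost_usd(3))  # 49500
print(project_cost_usd(6))  # 99000
```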

It’s important to keep in mind that further development and maintenance is an ongoing process that doesn’t end after the initial development period. After delivering the product, ongoing maintenance, bug fixes, updates, and improvements will likely be necessary to ensure the Spacebox Indexer remains functional and effective.


Forbole has voted Abstain on Prop #155 because we received a small grant from the ICF to continue the development of Juno back in 2021. It would not be appropriate for us to cast a non-abstain vote.

However, I do want to explore the possibility of combining development efforts into a single indexer instead of building a new one. Since fulfilling the ICF grant, Forbole has continued to maintain the project and make it generic yet extensible enough for every Cosmos chain. It is also well documented and has monitoring capabilities. This is how BDJuno (the Juno indexer for Big Dipper) and the aforementioned osjuno were developed. We understand that PostgreSQL has drawbacks, but it has been proven in the market over the years. The integration with Hasura has also been improving rapidly. There are also a lot of scaling solutions for PostgreSQL that enable fast queries with the lightweight Hasura. With the upcoming Cosmos SDK v0.47 release, Juno will perform a lot better, as many bottleneck issues will be solved.

Juno and BDJuno are always open source. We welcome any comments and contributions.
