What is Openbridge?
Openbridge evolved from a company that connects teams to data to a company that helps them get things done with data. Openbridge strives to make sure our products and services work harder for you, your team, and your work. Ultimately, Openbridge has a simple goal: make data accessible to those who need it.
When we can help unlock the value of data faster, our customers can generate insights that drive tangible business. For us, that is a success.
Our approach centers on removing the technical complexities that prevent analytics, business, and executive teams from quickly developing actionable insights from data and sharing them with clients, employees, and partners.
Why work with Openbridge?
We have built our platform and service offering to:
- improve team efficiency, effectiveness, and velocity by removing barriers to accessing & using data
- provide access to more sources of data quickly and consistently
- provide flexibility to use the analytics tools you need to use to get the job done
- develop insights & promote learning
- improve business impact & outcomes with better data access
We are highly confident that our platform, customizable solutions, services, and infrastructure are carefully constructed and well-aligned to drive accelerated business successes using data.
What does being "analytics-ready" mean?
Providing analytics-ready solutions means we roll up our sleeves and get to work aggregating, organizing, converting, routing, mapping, and storing data. We do this work, so your teams are not doing tedious data engineering and infrastructure. This includes ingesting from data sources, undertaking data modeling, data transformations, and data cleansing. Ensuring data is available and accessible for easy consumption by your enterprise is what we mean by an analytics-ready solution.
By working harder to deliver analytics-ready data, we can increase the velocity of exploratory analysis, visualizations, data science, or machine learning.
What is data ingestion?
Data ingestion describes the first stage of a data pipeline. At a high level, a pipeline moves a resource from a low-value place to a place of high value. For example, a pipeline runs water from reservoirs (low-value location) to homes (high-value location). The same is true for a data pipeline. A data pipeline reflects the movement from data sources (or systems where data resides) to data consumers (or those who need data to undertake further processing, visualizations, transformations, routing, reporting, or statistical models). This data integration process is responsible for mobilizing data as part of a data pipeline. Efficient, automated data migration from source to consumer is essential for unlocking data trapped in silos.
What do I need to get started using Openbridge?
Sign up for a 14-day trial, add a data source, a destination, and you are ready to go! Our support team can help out with any questions or bumps you hit along the way.
How does pricing work?
We offer several different pricing plans based on the number and type of data connectors you want to activate. We have detailed examples to help you determine which plan is the right fit. In addition, we also provide a collection of pricing scenarios to put everything into context.
If our published plans do not fit your needs, you can reach out to us about a custom pricing plan tailored to your requirements. Pricing for Enterprise, or third-party, integrations varies according to each vendor's plans and policies. As such, there are no free trials for Enterprise plans. For more details, check out our pricing page.
Is there a free trial for Enterprise connectors?
Unfortunately, variations in third-party contract terms and policies do not allow for free trials. Pricing for Enterprise, or third-party, integrations varies according to each vendor's contractual requirements. As such, we cannot offer free trial periods for Enterprise connectors. For more details, check out our pricing page.
What is the difference between Standard, Premium, or Enterprise connectors?
We offer several integrations as "Standard" connectors. All plans include these types of connectors. Another tier of integration is "Premium" connectors. Premium connectors are more complex, advanced, and require additional development resources. "Premium" connectors are only available within Professional and Team plans.
Enterprise, or third-party, integrations are only available to those users with Enterprise plans. Pricing for third-party integrations in the Openbridge Marketplace varies according to each vendor's procedures and policies. For more details on Enterprise plans, contact our sales team.
What if I don't see a data source that I need in your list of connectors?
Talk to our team or submit a support ticket indicating the source you need so we can make sure it is routed to the correct team. The more requests we get for a data source, the higher we prioritize building these new connectors.
When will I be charged?
We offer a 14-day, no credit card required, free trial. You are not asked for a credit card before you start your trial, nor will you be charged during the trial. There are no charges unless you decide to order the service at the end of your trial. During this time, you have access to standard and premium data integrations or destinations according to your selected plan. Before the end of the 14-day trial, you must switch to a paid plan to continue service. If you do not proceed with a paid plan, any integrations you set up are paused. The data we collected and delivered remains in your data lake or warehouse.
Enterprise customers get invoiced according to an agreed-upon billing schedule as outlined in your Enterprise Agreement.
Do you charge for a data lake or warehouse infrastructure?
No, there are no charges from Openbridge for your data lake or cloud warehouse. Any charges are billed directly to you by the data lake or warehouse provider (i.e., Amazon, Google, Microsoft, or Snowflake).
What is a data pipeline?
A data pipeline reflects a connection from a specific source resource to a target destination. For example, you have two Facebook pages you want to collect insights data from and deliver to BigQuery. Each page would reflect a unique resource, which would equal two data pipelines, one for each page ID.
Another example is Amazon Advertising. If you have 10 Amazon Advertising profiles that you want to deliver to Redshift, this would equal 10 data pipelines, one for each Amazon Advertising profile ID to a Redshift destination. For a detailed example, check out our pricing page.
What is ETL or ELT?
ELT (Extract, Load, Transform) is a variation of ETL (Extract, Transform, Load). With ELT, data transformations occur after data is loaded to a data lake or warehouse. With the advancement of cloud data warehouses like Amazon Redshift and Google BigQuery, and of data lakes using Amazon Redshift Spectrum, Amazon Athena, or Azure Data Lake, ELT has seen increased adoption.
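The ELT pattern can be illustrated in a few lines: extract raw records, load them unchanged, then run the transformation as SQL inside the warehouse itself. Below is a minimal sketch using Python's built-in sqlite3 as a stand-in warehouse; the table and column names are hypothetical and not part of the Openbridge platform:

```python
import sqlite3

# Extract: raw records pulled from a hypothetical source API
raw_orders = [
    ("2024-01-01", "US", 120.0),
    ("2024-01-01", "DE", 80.0),
    ("2024-01-02", "US", 200.0),
]

# Load: write the raw data as-is, with no transformation yet
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (order_date TEXT, country TEXT, amount REAL)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_orders)

# Transform: run SQL inside the "warehouse" after loading (the T in ELT)
daily_totals = conn.execute(
    "SELECT order_date, SUM(amount) FROM raw_orders "
    "GROUP BY order_date ORDER BY order_date"
).fetchall()
print(daily_totals)  # [('2024-01-01', 200.0), ('2024-01-02', 200.0)]
```

The key design point is that the transformation step runs where the data already lives, letting the warehouse's compute do the heavy lifting instead of a separate ETL server.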
How do I load my data to a data lake or cloud warehouse?
Once you activate a connector, your data is automatically collected, processed, and loaded to a target data destination. The initial load can take 24-48 hours after activating a pipeline. After that, data typically syncs daily or hourly, depending on the data source.
How does Openbridge deliver data pipelines?
Our mission is to provide analytics-ready data to free your business, IT, data science, and analysts from painful data wrangling work. By delivering analytics-ready data, we empower you to use an incredibly diverse array of tools to explore, analyze, and visualize so you can understand business performance quickly.
Our code-free, automated data pipelines route, process, and load data to your data lake, warehouse, or data mart ("data destination").
Delivering code-free pipelines unlocks the hidden potential of data for machine learning, business intelligence, data modeling, or online analytical processing.
Does Openbridge support data pipelines using databases as data sources?
Openbridge supports databases as data sources by employing Amazon’s Data Migration Service (DMS). The typical use case is to use DMS to export data to an AWS S3 landing zone, ingest the data into a curated data lake, register it in a data catalog, and create corresponding tables/views in Amazon Redshift Spectrum and Amazon Athena. If you are interested in this service, please contact us.
Does Openbridge support custom connectors, batch processing, or streaming?
Yes, Openbridge supports batch data loads via our batch processing service. Batch is perfect for large, bulk data transfers from internal on-premise services that you want to migrate to the cloud. We also support streaming data via our AWS Kinesis webhook service. Using an AWS Kinesis webhook extends the supported services to systems like Mailchimp, Shopify, and Zapier that can push data to a webhook. Some customers may need a custom connector. If you do need something like this, please reach out to our team to discuss. It is not uncommon for us to develop custom or proprietary services based on customer requirements.
Does Openbridge retain copies of our data?
Openbridge does not retain your data. We store all the data we process directly in data destinations designated and owned by the customer.
Does Openbridge store my data long term?
No, Openbridge is a data pipeline. We do not persist your data. You own the data destinations we deliver to. Data in our system is transient, which reflects the time it takes for us to collect data from a source and deliver it to your target destination.
If a situation occurs that prevents data from loading to your data destination (a copy error, a transformation error, a type conversion error, etc.), data not yet loaded may be temporarily stored in our system as a backup until the destination errors are rectified. Normally, destination errors result from a credential change or from our authorizations being reset without our system being updated.
How does Openbridge handle data retention?
The customer determines data retention policies. Since Openbridge does not host the data destinations, it is up to the customer to delete or archive data after specific periods.
What is a data destination?
Openbridge will collect, process, route, and load data to a target data destination of your choosing. A target is a single data warehouse or data lake. This destination reflects a unique cluster, host, storage system, data set, or database. Openbridge allows you to connect multiple destinations based on your plan. Once a warehouse or data lake is registered within Openbridge, it counts as a destination. For example, an "Explorer" plan allows you to have five destinations. These can be five Google BigQuery destinations or a mix-and-match of BigQuery plus Amazon Redshift and Amazon Athena. Ultimately, the choice is yours. It is your data!
How does Openbridge treat target data destinations differently?
Openbridge optimizes for each data destination we support. For example, each target destination has technical variations relating to data types, optimizations, naming restrictions/conventions, deduplication, and loads.
What queries does Openbridge run on my target data destination?
Other than loading and deduplicating data in your target data destination, we only run queries that verify permissions and test availability, plus the queries needed to create assets that materialize a data catalog for analytics, change control, and risk mitigation. See more on our data management approach.
What happens if the data lake or warehouse gets disconnected?
Openbridge is architected to prevent data loss or duplication. We buffer data once it's in the pipeline, so if a data lake or cloud warehouse gets disconnected, nothing is lost as long as it's reconnected before the buffer expires. Most customers have a seven-day buffer; Enterprise customers can define buffer policies and expiration intervals.
Can I add multiple data lake or warehouse destinations on my Openbridge account?
Yes, you can attach multiple destinations within your account. For example, let's say you wanted to partition your client data into unique BigQuery destinations, one for each client. You can have a separate Google BigQuery destination for each client. We also support a hybrid model, attaching different technologies like Amazon Athena, Azure Data Lake, Redshift, or BigQuery. For example, one client team wants to use BigQuery and another Azure. Ultimately, you get to choose the destination a data source is delivered to.
Why do I need a data lake or cloud warehouse?
We need someplace to store the data we collect! For security, privacy, and maximizing productivity, we deliver directly to a customer managed data destination. This means you always own your data.
Our job is to make sure your data is organized, processed, and loaded into a data lake or warehouse. This data destination acts as a central repository of data aggregated from disparate sources. The data lake or warehouse becomes a hub that allows you to plug in your favorite tools so you can query, discover, visualize, or model that data. We go over the data lake vs. data warehouse conversation to help teams make informed decisions.
Over time, the amount of data we collect can end up being millions or billions of records. Low-cost, high-performance cloud warehouses or data lakes are a perfect fit as a data destination for customer data. For example, did you know that Google BigQuery offers a free tier? Every month, Google gives you a free terabyte of queries, and you can load up to 10 GB of your own data at no cost.
Do you offer consulting services?
Yes! Openbridge expert services offer data consulting, engineering, advisory, or subject matter expertise as needed by our customers. Need a PoC? Exploring the possibility of a data lake? Trying to pick a new BI platform? Expert services enable you to choose the level of engagement that's right for you — from full-service to self-service, and anything in between.
Openbridge can help you develop strategies to harness data that matters most, including how to architect the right solutions to support your business. Tapping into our data expertise provides hands-on knowledge, so technology solutions are focused on accelerating your data and insights efforts.
What are examples of consulting engagements?
Here is a sample of the types of consulting and advisory engagements:
- Solving fragmented visibility into critical customer insights caused by data hidden in siloed marketing systems like Salesforce, DoubleClick, Google, Facebook, Adobe, and many others
- Exploring how to wrangle diverse social, mobile, web, marketing, and media data into a cohesive “omni-channel” view of a customer to inform and drive action
- Creating a data analytics "proof-of-concept" to secure approvals and funding
- Exploring the use of new analytics, data visualization, and reporting tools like Tableau, Mode Analytics, Qlik, Looker, and others
- Accelerating adoption of tools you have already selected, like Tableau, Mode Analytics, Qlik, or Looker, by fueling them with the data they need for you to realize value from your investments
- Transitioning from manual data processes and reporting to increase efficiency and the value you deliver to clients or internal teams
How do I transform my data?
Openbridge is an Extract, Load, and Transform (ELT) platform, with the "T" happening in external tools. While we focus on ELT, we also undertake ETL (see ELT vs ETL).
Openbridge believes in increasing the velocity of data access, which means we focus on quickly delivering analytics-ready data. In most cases, additional transformations are not needed. As a result, we do not provide direct tooling or software to transform data. We suggest best-in-class transformation tools like Tableau Data Prep, Alteryx, or Trifacta, which offer attractive and cost-effective solutions.
How do I access my data after you load it to a target destination?
There are a few different options depending on what you want to do with the data. If you want to undertake data analysis, it is not uncommon for customers to use tools like Grow, Tableau, Microsoft Power BI, or Looker.
Another option is programmatically connecting to your data with ETL tools like Oracle, Azure, Talend, and others in that class. Using ETL tools is not uncommon in cases where further downstream processing, modeling, and transformations need to occur prior to importing into an EDW.
Lastly, you can write raw SQL queries using SQL Workbench, the command line, scripts, and similar resources.
The three options above are not mutually exclusive of each other! Our goal is to provide standardized access, regardless of the data access use cases.
What do I do with the data once it is in my target data destination?
The data is yours, so you can do whatever you wish with it. What you do with your data once it is in your target data destination is up to you. As a result, you have the freedom to choose preferred visualizations or business intelligence solutions. Most of our customers utilize business intelligence and visualization tools, including Looker, Mode, Power BI, or Tableau.
We focus on being a simple, cost-effective data pipeline. The goal of our pipelines is to deliver "analytics-ready" data.
Why is my data not available in my target data destination?
Openbridge is only as good as the data sources and destinations we connect. If a source has a delay, outage, or failure, then we will encounter a delay in data replication to your destination. If your target data destination is unavailable due to failed authorizations, firewalls, removal, or resource constraints, then there will be a delay (or outright failure) in data replication.
Does Openbridge offer an on-premise solution?
Openbridge is a cloud-based, hosted solution. However, we do offer various on-premise and non-SaaS services. If you are interested in learning more, contact us.
Is Openbridge resilient to changes in my data?
We know data schemas change, so our platform embraces this change. We've created a "set it and forget it" experience for your data, which we call "zero administration." When setting up an integration, we handle the schema and definition mapping to your target data destination automatically. As source data changes, all corresponding schema changes are registered and updated automatically in your target destination.
What visualization, business intelligence, or reporting tools can I use?
You have a lot of great and affordable choices these days! Tools like Grow, Tableau, Microsoft Power BI, or Looker are just a few options to consider. Take a look at the full list of tools or read our DZone article about creating an objective 10-point business intelligence tool scorecard to help narrow the field. If you have questions, feel free to reach out to us.
Why can't I see any data?
If you just set up your account, it could take anywhere from a couple of hours to a couple of days to complete a sync, depending on the data source. If it has been several days, please submit a support ticket, and we will look into it.
How does Openbridge handle changes in the source?
Our pipelines are configured to handle new fields or tables added to your source gracefully, so you don't need to make changes on your end. We continuously monitor and stay ahead of changes or deprecations, so you don't need to think about it.
What is the average data sync time for each source?
Most sources complete a data sync in less than a day. However, the amount of data in the source and API rate limits may cause sync times to vary.
What are the costs for data lake or cloud warehouses?
Every data lake or cloud warehouse has its own pricing model. Pricing varies by usage, which is defined by the compute and storage consumed or provisioned.
Depending on your situation and requirements, different price-performance considerations may come into play. For example, if you need to start with a no- or low-cost solution, Amazon Athena and Google BigQuery should be considerations. Both services charge only according to usage.
On-demand usage pricing may provide you with the essentials to kickstart your efforts. If you have questions, feel free to reach out to us. We can offer some tips and best practices on how best to set up a data lake or cloud warehouse based on your needs.
Can I use standard SQL to access my data?
Yes, using standard SQL is supported. BigQuery, Amazon Redshift, Amazon Athena, Amazon Redshift Spectrum, and others support familiar SQL constructs. There may be some limitations or best practices for the specific use case, but the rule of thumb is that SQL is available.
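As a quick illustration of that portability, the sketch below runs a CTE and a window function, two standard SQL constructs shared by BigQuery, Redshift, and Athena, against Python's built-in sqlite3 engine as a stand-in. The table and column names are hypothetical; each warehouse's client library and dialect has its own quirks:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 30.0), ("west", 25.0)],
)

# A CTE plus a window function -- standard SQL constructs that also run
# (with minor dialect differences) on BigQuery, Redshift, and Athena
rows = conn.execute("""
    WITH totals AS (
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
    )
    SELECT region, total, RANK() OVER (ORDER BY total DESC) AS rnk
    FROM totals
    ORDER BY rnk
""").fetchall()
print(rows)  # [('east', 40.0, 1), ('west', 25.0, 2)]
```

Because the query sticks to standard constructs, the same statement can usually be moved between engines with little or no rewriting.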
How is data lake or warehouse data encrypted?
Most vendors encrypt data in transit and at rest. In transit, vendors support SSL-enabled connections between your client application and your data destination. At rest, vendors encrypt data using AES-256 or customer-defined methods.
How do I load data into my Google BigQuery or Amazon Redshift data warehouse?
You can load data into Google BigQuery or Amazon Redshift from a range of data sources like Amazon Seller Central, Google Ads, Facebook, YouTube, or from on-premises systems via batch import. Openbridge automates these data pipelines so you can ingest data into your data warehouse cluster code-free.
Does Redshift, BigQuery or Snowflake support querying a data lake?
Yes, Redshift, BigQuery, and Snowflake all support querying data in a lake. Data lakes are a compelling solution, and these services let you query data in your data lake with our fully automated data catalog, conversion, and partitioning service.
Do I need a services engagement for BigQuery, Redshift, Athena, or Spectrum?
Typically, expert consulting is not needed. Most customers are up and running using their preferred warehouse or data lake quickly. However, if you need support, we do offer data consulting. There may be situations where you have specific needs relating to your preferred solution. These situations can require expert assistance to tailor the system data to fit your requirements.
Ultimately, our mission is helping you get value from data, and this can often happen more quickly with the assistance of our passionate expert services team.
Do I need to authorize Openbridge access to my data lake or warehouse?
Yes, typically Amazon Athena, Google BigQuery, Amazon Redshift, and others require authorization to load your data. You would provide us with the correct authorizations, so we can properly connect to these systems. The process takes a few minutes to set up in your Openbridge account.
Do you follow best practices for data partitioning within data lakes?
Yes! Azure and AWS suggest that partitioning can help reduce the volume of data scanned per query, thereby improving performance and reducing cost.
You can restrict the amount of data scanned because partitions act as virtual columns. When you combine partitions with the use of columnar data formats like Apache Parquet, you are optimizing for best practices.
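To make the "virtual columns" idea concrete, here is a simplified sketch of Hive-style key=value partition paths, the directory layout engines like Athena expect. A filter on the partition column means only files under the matching directories are ever read; the paths and file contents here are hypothetical:

```python
import os
import tempfile

# Write one file per partition using Hive-style dt=VALUE directories
root = tempfile.mkdtemp()
for dt, payload in [("2024-01-01", "a,1\n"), ("2024-01-02", "b,2\n")]:
    part_dir = os.path.join(root, f"dt={dt}")
    os.makedirs(part_dir)
    with open(os.path.join(part_dir, "data.csv"), "w") as f:
        f.write(payload)

# "Partition pruning": a filter on dt maps to a directory, so files in
# the other partition are never opened or scanned at all
def files_scanned(root, dt):
    part_dir = os.path.join(root, f"dt={dt}")
    return [os.path.join(part_dir, name) for name in os.listdir(part_dir)]

scanned = files_scanned(root, "2024-01-01")
print(len(scanned))  # 1 of the 2 files on disk
```

The partition value never needs to be stored inside the files themselves, which is why it behaves like a virtual column.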
Do you optimize for AWS Athena queries to a data lake?
Yes! We follow Amazon's best practices relating to the file size of the objects we partition, split, and compress. Doing so ensures queries run more efficiently, and reading data can be parallelized because blocks of data are read sequentially.
These benefits apply mostly to larger files; smaller files (generally less than 128 MB) do not always realize the same performance gains.
Redshift Spectrum vs Athena, which is better?
See our post on Redshift Spectrum vs Athena which details the decision making process.
Do you support compression and file splitting for data lakes?
Yes! Azure and AWS suggest that compression and file splitting can significantly speed up Athena queries. Smaller data sizes mean optimized queries and reduced network traffic for data stored in Amazon S3 or Azure.
For example, when your data is splittable, execution engines like Presto or AWS Athena can optimize the reading of a file to increase parallelism and reduce the amount of data scanned. With an unsplittable file, only a single reader can read the file, though this matters less for smaller files (generally less than 128 MB).
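The compression half of this is easy to see with Python's standard library: repetitive text data like CSV compresses dramatically, reducing the bytes stored and transferred. Note that gzip itself is one of the unsplittable formats discussed above, so a single reader must consume the whole stream. This is an illustrative sketch, not how Openbridge processes files internally:

```python
import gzip

# Repetitive CSV-like data, typical of log or event exports
raw = b"2024-01-01,click,US\n" * 10_000

compressed = gzip.compress(raw)

# Compression shrinks the bytes scanned and transferred, but a gzip
# stream must be read start-to-finish by one reader (not splittable)
ratio = len(compressed) / len(raw)
print(f"{len(raw)} -> {len(compressed)} bytes (ratio {ratio:.4f})")
```

This trade-off is why splittable, compressed columnar formats are generally preferred for large files, while gzip remains fine for smaller ones.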
Do you support columnar data formats like Apache Parquet for a data lake?
Yes! Azure and AWS suggest the use of columnar data formats. We have chosen Apache Parquet over other columnar formats. Parquet stores data efficiently with column-wise compression, selecting encodings and compression based on the data type.
Openbridge automatically handles the conversion of data to Parquet format, saving you time and money, primarily when Athena executes queries that are ad hoc in nature. Also, using Parquet-formatted files means reading fewer bytes from Amazon S3, leading to better Athena query performance. See our post on Apache Parquet benefits.
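The core columnar idea can be shown without Parquet itself: storing one array per column means a query that needs a single column reads only that column's bytes, and grouping same-typed values together improves compression. A simplified pure-Python sketch with hypothetical record fields (Parquet layers per-column encodings, row groups, and metadata on top of this):

```python
import gzip
import json

# 1,000 hypothetical event records
rows = [
    {"user": f"user{i % 50}", "country": "US", "amount": round(i * 1.5, 1)}
    for i in range(1000)
]

# Row-oriented layout: whole records serialized together, columns interleaved
row_bytes = json.dumps(rows).encode()

# Column-oriented layout: one array per column, same-typed values adjacent
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Reading a single column touches only that column's bytes...
amount_only = json.dumps(columns["amount"]).encode()

# ...and similar values sitting together (e.g. 1,000 copies of "US")
# compress better than interleaved records
row_gz = len(gzip.compress(row_bytes))
col_gz = len(gzip.compress(json.dumps(columns).encode()))
print(len(amount_only), len(row_bytes), col_gz, row_gz)
```

Both effects, fewer bytes read per column and better compression, are what drive the Athena cost and performance gains described above.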
How do I contact the Openbridge support team?
Submit a support ticket here.