Frequently Asked Questions

Discovering solutions, exploring options, uncovering insights or finding clarity.

Pricing

Issues related to payments or invoicing.

7 answers

For more information on our pricing and plans, see this article: How Pricing and Billing Work.

You can access this information via the customer billing portal.

Once a plan is selected, you pay Openbridge the applicable plan charges described in an Order Form. In the event you want to upgrade to a different plan, Openbridge prorates the remainder of the current plan against the charges of the new plan. The difference between the current cost and the prorated amount is identified on the Order Form. Unless detailed otherwise, all fees are non-cancelable and non-refundable.

We offer a 30-day, no credit card required, free trial. You are not asked for a credit card before you start your trial, nor will you be charged during the trial. There are no charges unless you decide to order the service at the end of your trial. During this time, you have access to standard and premium data integrations or destinations according to your selected plan. Before the end of the 30-day trial, you must switch to a paid plan to continue service. If you do not proceed with a paid plan, any integrations you set up are paused. The data we have collected and delivered remains in your data lake or warehouse.

Enterprise customers get invoiced according to an agreed-upon billing schedule as outlined in your Enterprise Agreement.

We believe in making pricing transparent and straightforward. Our published plans detail most of our offering. If you are interested in a customized plan estimate, reach out to our data experts. Our team reviews your requirements and provides you with the appropriate estimate based on your situation.

Data Destination

Issues related to Data Destination.

24 answers

Openbridge collects, processes, routes, and loads data to a target destination of your choosing. A target is a single data warehouse or data lake that reflects a unique cluster, host, data set, or database. Openbridge allows you to connect to multiple destinations based on your plan. Once a warehouse or data lake is registered within Openbridge, it counts as a destination.

Maximizing productivity means your data needs to be organized, processed, and loaded into a data lake or warehouse. This data lake or warehouse acts as a central repository of data aggregated from disparate sources. The data lake or warehouse becomes the hub that allows you to plug in your favorite tools so you can query, discover, visualize or model that data.

No, there are no charges from Openbridge for your warehouse. Any charges are directly billed by the warehouse provider (i.e., Amazon, Google, Microsoft, or Snowflake) to you.

Every data lake or cloud warehouse has its own pricing model. Pricing varies by usage, which is defined by the compute and storage consumed or provisioned. Depending on your situation and requirements, different price-performance considerations may come into play. For example, if you need to start with a no- or low-cost solution, Amazon Athena and Google BigQuery only charge according to usage. On-demand usage pricing may provide you with the essentials to kickstart your efforts. If you have questions, feel free to reach out to us. We can offer some tips and best practices on how best to set up a data lake or warehouse based on your needs.

When it comes to building your data strategy and architecture, it's essential to understand which data lakes or warehouses should be candidates for consideration. Typically, teams will be asking themselves questions like "How do I install and configure a data warehouse?" "Which data warehouse solution will help me get the fastest query times?" or "Which of my analytics tools are supported?"

This article covers the key features and benefits of five widely used data lake and warehouse solutions supported by Openbridge to help you choose the right one: How to Choose a Data Warehouse Solution that Fits Your Needs. If you have questions, feel free to reach out to us.

Yes, typically, Amazon Athena, Google BigQuery, Amazon Redshift, and others require authorization to load your data. You would provide us with the correct authorizations, so we can properly connect to these systems. The process takes a few minutes to set up in your Openbridge account. You can read about how to set up AWS Athena, Redshift Spectrum, Redshift, and Google BigQuery.
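
For illustration, here is a minimal sketch, assuming hypothetical bucket and profile names, of verifying that the AWS access you authorize can reach the S3 location a destination such as Athena or Redshift Spectrum uses:

    import boto3

    # Hypothetical names for illustration only
    BUCKET = "my-company-datalake"

    session = boto3.Session(profile_name="openbridge-access")  # the credentials you authorized
    s3 = session.client("s3")

    # Confirm which identity is authorized and that the bucket is reachable
    identity = session.client("sts").get_caller_identity()
    print("Authorized as:", identity["Arn"])

    s3.head_bucket(Bucket=BUCKET)  # raises an error if access is denied
    print("Bucket is reachable:", BUCKET)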

The data is yours, so you can do whatever you wish with it! What you do with your data once it is in your target data destination is up to you. As a result, you have the freedom to choose preferred visualizations or business intelligence solutions. Most of our customers utilize business intelligence and visualization tools, including Looker, Mode, Power BI, or Tableau.

We focus on being a simple, cost-effective data pipeline. The goal of our pipelines is to deliver "analytics-ready" data.

Openbridge is only as good as the data sources and destinations we connect. If a source has a delay, outage, or failure, then there is a delay in data replication to your destination. If your target data destination is unavailable, for example due to failed authorizations, firewall restrictions, removal of the destination, or heavy load, then there is a delay (or outright failure) in data replication.

Openbridge is a cloud-based, hosted solution.

Openbridge optimizes for each data destination we support. For example, each target destination has its own specifics for data types, naming restrictions/conventions, deduplication, and loads.

Other than loading and deduplication of data to your target data destination, we only run queries that verify permissions and create a data catalog for change control and risk mitigation.

Openbridge is architected to prevent data loss or duplication. We buffer data once it's in the pipeline, so if a data warehouse gets disconnected, nothing is lost as long as it's reconnected before the buffer expires. Most customers have a two-week buffer; Enterprise customers can define custom data retention policies and expiration intervals.

Yes, you can attach multiple destinations to your account. For example, let's say you wanted to partition your client data into unique BigQuery destinations, one for each client. You can have various Google BigQuery destinations. We also support a hybrid model, attaching different technologies such as Amazon Athena, Redshift, or BigQuery. Ultimately, you get to choose the destination a data source is delivered to.

If you just set up your trial, it could take anywhere from a couple of hours to a couple of days to complete the historical sync, depending on the size of your data source. If it has been several days, please submit a support ticket, and we will look into it.

Most vendors encrypt data in transit and at rest. In transit, vendors support SSL-enabled connections between your client application and your data destination. At rest, vendors encrypt data using AES-256 or customer-defined methods.
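
For example, a minimal sketch, with a placeholder host and credentials, of opening an SSL-enabled connection to an Amazon Redshift destination from Python:

    import psycopg2  # common PostgreSQL driver, also used for Amazon Redshift

    # Placeholder host and credentials for illustration only
    conn = psycopg2.connect(
        host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
        port=5439,
        dbname="analytics",
        user="openbridge_user",
        password="********",
        sslmode="require",  # refuse unencrypted connections in transit
    )
    print("Connection open:", conn.closed == 0)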

You can load data into Amazon Redshift from a range of data sources like Amazon Seller Central, Google Ads, Facebook, YouTube, or from on-premises systems via batch import. Openbridge automates these data pipelines so you can ingest data into your data warehouse cluster code-free.
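
As a hedged sketch of what such a batch import can look like under the hood (the table, bucket, and IAM role names are hypothetical), a COPY statement issued over a connection like the one shown above:

    # Assumes "conn" is an open Redshift connection such as the SSL example above
    copy_sql = """
        COPY sales_staging
        FROM 's3://my-company-datalake/exports/sales/'
        IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
        FORMAT AS PARQUET
    """

    with conn.cursor() as cur:
        cur.execute(copy_sql)
    conn.commit()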

Yes, Redshift supports querying data in a lake via Redshift Spectrum. Data lakes are the future, and Amazon Redshift Spectrum allows you to query data in your data lake; Openbridge complements this with a fully automated data catalog, conversion, and partitioning service.

Typically, expert or consulting services are not needed for Amazon Redshift. Most customers are up and running with their Amazon Redshift data quickly. However, if you need support, we do offer expert services. There may be situations where you have specific needs relating to Amazon Redshift data; these can require expert assistance to tailor the data to fit your requirements.

Ultimately, our mission is to help you get value from data, and this can often happen more quickly with the assistance of our passionate expert services team.

Yes! Amazon suggests that the use of partitioning can help reduce the volume of data scanned per query, thereby improving performance and reducing cost. You can restrict the amount of data scanned because partitions act as virtual columns. When you combine partitions with the use of columnar data formats like Apache Parquet, you are optimizing for best practices.
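
As a hedged illustration (the database, table, partition column, and S3 paths are placeholders), a partition-filtered query submitted to Athena from Python:

    import boto3

    athena = boto3.client("athena", region_name="us-east-1")

    # Filtering on the partition column ("dt" here) limits the data scanned
    query = """
        SELECT campaign_id, SUM(clicks) AS clicks
        FROM ads_performance
        WHERE dt BETWEEN '2020-01-01' AND '2020-01-31'
        GROUP BY campaign_id
    """

    response = athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": "my_athena_database"},
        ResultConfiguration={"OutputLocation": "s3://my-company-datalake/athena-results/"},
    )
    print("Query execution id:", response["QueryExecutionId"])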

Yes! We follow best practices relating to the file size of the objects we partition, split, and compress. Doing so ensures queries run more efficiently, and reading data can be parallelized because blocks of data are read sequentially. This holds mostly for larger files; smaller files, generally less than 128 MB, do not always realize the same performance benefits.

Yes! Amazon suggests compression and file splitting can significantly speed up Athena queries. Smaller data sizes mean optimized queries and reduced network traffic between Amazon S3 and Athena.

When your data is splittable, Openbridge does this Athena optimization for you. This allows the execution engine in Athena to optimize the reading of a file to increase parallelism and reduce the amount of data scanned. If a file is unsplittable, only a single reader can read it. This only happens in the case of smaller files (generally less than 128 MB).

Yes! Amazon suggests the use of columnar data formats. We have chosen to use Apache Parquet over other columnar formats. Parquet stores data efficiently with column-wise compression, applying different encoding and compression schemes based on the data type.

Openbridge automatically handles the conversion of data to Parquet format, saving you time and money, primarily when Athena executes queries that are ad hoc in nature. Also, using Parquet-formatted files means reading fewer bytes from Amazon S3, leading to better Athena query performance.
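
Openbridge handles this conversion for you, but a minimal sketch (the file names are placeholders) of what writing column-compressed Parquet looks like in Python may help illustrate the format:

    import pandas as pd  # requires the pyarrow package for Parquet support

    # Read a raw CSV export and rewrite it as compressed, columnar Parquet
    df = pd.read_csv("facebook_insights_export.csv")
    df.to_parquet(
        "facebook_insights_export.parquet",
        compression="snappy",  # column-wise compression
        index=False,
    )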

Yes, Google provides a collection of client libraries in their SDK package. You can download the command-line tools for Google Cloud Platform products and services here.

We have also bundled the Google SDK with Docker for a "run anywhere" solution. This includes a set of services that can export and import operations for BigQuery. Get the pre-built Docker image with Google Cloud SDK.

Yes, BigQuery provides a per-user cache at no charge. If your data doesn't change, the results of your queries are automatically cached for 24 hours.
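
For example, a small sketch (the project, dataset, and table names are placeholders) of how the cache surfaces in Google's official Python client:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-gcp-project")

    job_config = bigquery.QueryJobConfig(use_query_cache=True)  # caching is on by default
    job = client.query(
        "SELECT COUNT(*) FROM `my-gcp-project.analytics.orders`",
        job_config=job_config,
    )
    job.result()  # wait for the query to finish
    print("Served from cache:", job.cache_hit)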

Data Sources

Issues related to Data Sources.

13 answers

No, Openbridge is a data pipeline. We do not persist your data. Your data is only in our system transiently, meaning for the time it takes us to collect data from a source and deliver it to a target destination.

If a situation occurs that prevents data from loading to your data destination (a copy error, a transformation error, a type conversion error, etc.), the events that have not yet replicated to your data destination are temporarily stored in our system as a backup until the destination errors are rectified.

We offer several integrations as "Standard" connectors. All plans include these types of connectors. Another tier of integration is "Premium" connectors. Premium connectors are often more complex and advanced, and they require additional development resources. "Premium" connectors are only available within Professional and Business plans.

Enterprise or third-party integrations are only available to those users with Enterprise plans. Pricing for third-party integrations varies according to each vendor's procedures and policies.

We offer several data lakes or warehouse solutions as "Standard" destinations. All plans include these types of destinations. Another tier of destination is "Premium." Premium destinations are only available within specific plans.

  • Standard — Amazon Redshift, Google BigQuery, Snowflake, Amazon Athena, Amazon Redshift Spectrum
  • Premium — Databricks, Delta Lake

We know data schemas change, so our platform embraces this change. We've created a "set it and forget it" approach to your data, which we call "zero administration." When setting up an integration, we handle the schema and definition mapping to your target data destination automatically. As source data changes, all corresponding schema changes are registered and updated automatically in your target destination. See more on this approach here.

Once you've created and configured a data source and your target data destination and activated the connector, your data is automatically collected, processed, and routed. This process can take up to 24-48 hours after setup, depending on the data source.

Talk to your product specialist or submit a support ticket indicating the source you need, and your request is routed to the correct person. The more requests we get for a data source, the higher we prioritize building the new connectors.

Our pipelines are configured to handle new fields or tables added to your source gracefully, so you don't need to make changes on your end. We continuously monitor and stay ahead of changes or deprecations, so you don't need to think about it.

Most sources complete data sync in less than a day. However, the amount of data in the source and API rate limits may cause this to vary.

Our goal is to make every data source available for ordering online. However, we do not currently support online ordering for third-party integrations. There are a couple of extra steps involved in setting those up to ensure that your integration and the data you are looking for are available.

You can add as many as you need. If you manage a large number of accounts and want to consolidate the data from each, reach out to us for some tips and best practices on how best to set up that type of configuration.

In many cases, it is minutes. For third-party integrations, it takes 24-48 hours to get up and running. However, that timing can vary depending on a few variables. For example, there may be individual permissions, contracts, or setup required by a vendor that is outside our control.

If the integration you want isn't supported, you can check whether that data source supports access via batch SFTP export.
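
As a hedged sketch (the host, credentials, and file paths are placeholders), pushing such a batch export to an SFTP endpoint from Python might look like this:

    import paramiko

    # Placeholder host and credentials for illustration only
    transport = paramiko.Transport(("sftp.example.com", 22))
    transport.connect(username="exports", password="********")

    sftp = paramiko.SFTPClient.from_transport(transport)
    sftp.put("daily_orders_export.csv", "/uploads/daily_orders_export.csv")

    sftp.close()
    transport.close()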

We are continuously expanding our support for integrations and targets. If there is an integration you'd like us to support that we don't currently, please contact us! An integration may already be in development, and there may be opportunities to join a beta group.

We know that data changes — schemas grow, shrink, and change. We've got your back! See our data catalog process that automatically adapts to changes.

Data Tools And Analytics

Issues related to Data Tools And Analytics.

3 answers

You have a lot of great and affordable choices these days! Tools like Tableau Software, Looker, Mode Analytics, and Microsoft Power BI are just a few options to consider.

Take a look at the full list of tools or read our DZone article about creating an objective 10-point business intelligence tool checklist to help narrow the field. If you have questions, feel free to reach out to us.

We are considered an Extract, Load, and Transform (ELT) platform with the "T" happening in external tools. Openbridge believes in increasing the velocity of data access, which means we focus on quickly delivering "analytics ready" data.

In most cases, additional transformations are not needed. As a result, we do not provide direct tooling or software to transform data. We suggest the use of best-in-class "transform" tools like Tableau Data Prep, Alteryx, or Trifacta, which offer attractive and cost-effective solutions.

Yes, using standard SQL is supported. Amazon Redshift, Amazon Athena, Google BigQuery, Snowflake, and others support familiar SQL constructs. There may be some limitations or best practices for a specific use case, but the rule of thumb is that SQL is available.

General

General Q&A.

1 answer

A data pipeline reflects a connection from a specific source resource to a target destination. For example, you have two Facebook pages to collect insights data from and deliver to BigQuery. Each page would reflect a unique resource, equal to two data pipelines, one for each page ID.

Another example is Amazon Advertising. If you have 10 Amazon Advertising profiles that you want to deliver to Redshift, this would equal 10 data pipelines, one for each Amazon Advertising profile ID.

