Labelbox

On-Premise or VPN/VPC Data

Labelbox works with source data hosted on-premise or on a private cloud. The source data is accessed directly from the client computer and is never shared with, or accessible by, Labelbox.

Using Labelbox with on-premise or private cloud source data means Labelbox will not have access to any of the source data (images, text, etc.). Labelbox will only store the label annotation data created in the interface.

Private Cloud Data using VPN/VPC

To enable teams of remote workers to access on-premise or private cloud data, use a VPN. Here are some resources on setting up a VPN for popular hosting providers:

Create a File with URLs

If your data is hosted in the cloud (e.g. Amazon S3), you can point Labelbox to your data by creating a JSON file or a CSV file with URLs to each file.

Creating a JSON file (Recommended)

Create a JSON file containing the data URLs. For example, here is a JSON snippet for importing data hosted in Google Cloud Storage:

[
  {
    "externalId": "ab65d5e99w13",
    "imageUrl": "https://storage.googleapis.com/labelbox-example-datasets/tesla/104836109-p100d-review-5.1910x1000.jpeg"
  },
  {
    "externalId": "ljk6s544a7f8",
    "imageUrl": "https://storage.googleapis.com/labelbox-example-datasets/tesla/2017-Tesla-Model-3-top-view.jpg"
  }
]
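As a minimal sketch, a file in this shape could be generated with a short Python script rather than written by hand. The `externalId` and `imageUrl` keys come from the snippet above; the function name `build_import_json` and the `asset-N` ID scheme are illustrative assumptions, not part of Labelbox.

```python
import json


def build_import_json(urls, id_prefix="asset"):
    """Return a list of records shaped like the JSON snippet above.

    The external IDs ("asset-0", "asset-1", ...) are a hypothetical
    scheme; use whatever IDs identify your own data.
    """
    return [
        {"externalId": f"{id_prefix}-{index}", "imageUrl": url}
        for index, url in enumerate(urls)
    ]


if __name__ == "__main__":
    urls = [
        "https://storage.googleapis.com/labelbox-example-datasets/tesla/104836109-p100d-review-5.1910x1000.jpeg",
        "https://storage.googleapis.com/labelbox-example-datasets/tesla/2017-Tesla-Model-3-top-view.jpg",
    ]
    # Print the JSON so it can be redirected into a file of your choice.
    print(json.dumps(build_import_json(urls), indent=2))
```

Redirecting the script's output to a file gives you something to upload; the file name is up to you.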

Creating a CSV File

The first column is the URL and the second column is the External ID (optional). Here is an example CSV file containing image URLs:

Data_URLs,External_ID
http://res.cloudinary.com/ddpai9fpa/image/upload/v1516660804/isu8zqke6xoopnemuvvc.jpg,ID1
http://res.cloudinary.com/ddpai9fpa/image/upload/v1516660805/ldie4gmhaqfhw1df1wls.jpg,ID2
http://res.cloudinary.com/ddpai9fpa/image/upload/v1516660805/dvbb5kv3dudxhibqpuni.jpg,ID3
http://res.cloudinary.com/ddpai9fpa/image/upload/v1516660806/inm7ipr8h9ecx1fzcspm.jpg,ID4

Labelbox expects a CSV with at least the first column, Data URL, populated with URLs to each data asset. The second column, External ID, is optional and can contain a user-defined unique ID for that data asset.
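The two expectations above (a URL in the first column, unique values in the optional second column) can be checked before uploading. The following is only a sketch of such a check; `check_csv` is a hypothetical helper, and it assumes the file has a header row as in the example.

```python
import csv
import io


def check_csv(text):
    """Return a list of problems found in CSV text that has a header row.

    Checks only what is described above: every row needs a value in the
    first column, and External IDs (when present) must not repeat.
    """
    problems = []
    seen_ids = set()
    rows = csv.reader(io.StringIO(text))
    next(rows, None)  # skip the header row
    for row_number, row in enumerate(rows, start=2):
        if not row or not row[0].strip():
            problems.append(f"row {row_number}: missing data URL")
            continue
        if len(row) > 1 and row[1].strip():
            external_id = row[1].strip()
            if external_id in seen_ids:
                problems.append(
                    f"row {row_number}: duplicate External ID {external_id}"
                )
            seen_ids.add(external_id)
    return problems
```

An empty list means no problems were found by these two checks; it does not confirm that the URLs are reachable.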

Note on URL protocol

Be sure that the URLs you provide to Labelbox are prefixed with the appropriate protocol. For a locally hosted server that is typically http://, but if your server supports https://, using it is always recommended.
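A quick way to catch URLs that are missing the protocol is to parse them before building the file. This is a small sketch using Python's standard `urllib.parse`; the helper name `has_http_scheme` is an assumption made for illustration.

```python
from urllib.parse import urlsplit


def has_http_scheme(url):
    """True if the URL has an explicit http:// or https:// prefix and a host."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

For example, `has_http_scheme("192.168.1.112:8000/cat.jpg")` is false because the protocol prefix is missing, while the same address with `http://` in front passes.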

Upload the CSV to Labelbox

On-Premise Data

macOS and Linux Only

A step-by-step guide to using Labelbox with data on your hard drive. First we'll start an HTTP server running locally to serve up the files, then we'll generate a CSV of links to the files and upload it to Labelbox.

Place all Files in a Single Folder

Put all of the files you want to label in a single folder on your hard drive.

Get the IP address of your Computer

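The command below should print your IP address, e.g. 192.168.1.112. The interface name en0 is typical on macOS; on Linux the interface is usually named differently (for example eth0 or wlan0), so substitute your own.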

ifconfig en0 | grep inet | grep -v inet6 | awk '{print $2}'
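If the interface name is not known, the address can also be looked up without ifconfig. The sketch below is one common approach, not something Labelbox prescribes: it opens a UDP socket and asks the operating system which local address it would use for outbound traffic. The function name `local_ip_address` and the destination 192.0.2.1 (a reserved documentation address) are arbitrary choices for this example.

```python
import socket


def local_ip_address():
    """Best-effort lookup of this machine's LAN IPv4 address.

    Connecting a UDP socket sends no packets; it only makes the OS pick
    the local interface it would route through.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.connect(("192.0.2.1", 80))
            return sock.getsockname()[0]
        except OSError:
            # No usable route (for example, offline): fall back to loopback.
            return "127.0.0.1"


if __name__ == "__main__":
    print(local_ip_address())
```

On machines with several network interfaces the result may not be the one your labelers can reach, so compare it against the ifconfig output.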

Start the HTTP Server on your Computer

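You can start a local server from the command line. Run it from inside the folder that contains your files, since these servers serve the current directory.

# To use Python 3
python3 -m http.server 8000

# To use Python 2
python -m SimpleHTTPServer 8000

# To use NodeJS
npm install -g http-server && http-server -p 8000

# Note: in these examples, 8000 is the port we are serving from.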
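The same server can be started from a Python 3 script, which lets you name the folder explicitly instead of relying on the current directory. This is a sketch using the standard `http.server` module (Python 3.7 or later); `start_file_server` is a hypothetical helper name, and binding to 0.0.0.0 is an assumption so that other machines on the network can connect.

```python
import functools
import http.server
import threading


def start_file_server(directory, port=8000, host="0.0.0.0"):
    """Serve `directory` over HTTP in a background thread; return the server."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    # A daemon thread stops automatically when the main program exits.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call `start_file_server("/path/to/your/folder")` and keep the program running for as long as people are labeling; `server.shutdown()` stops it.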

Create a CSV with Your Data

If you now visit http://<your-ip-address>:8000 you should see a directory listing with all your files. Next, cd into the directory that contains your files and run the commands below to generate data.csv.

IP_ADDRESS=$(ifconfig en0 | grep inet | grep -v inet6 | awk '{print $2}')

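# Header first, then one URL per file; the glob handles odd file names and data.csv itself is skipped
CSV=$(echo "Data URL"; for fileName in *; do [ "$fileName" != "data.csv" ] && echo "http://$IP_ADDRESS:8000/$fileName"; done)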

echo "$CSV" > data.csv
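File names that contain spaces or other special characters need to be percent-encoded to form valid URLs, which the shell commands above do not do. The sketch below is a Python alternative under the same assumptions (files in one folder, served on port 8000, a single "Data URL" column); `build_rows` and `write_csv` are hypothetical helper names.

```python
import csv
import os
from urllib.parse import quote


def build_rows(directory, base_url):
    """Return CSV rows: a header, then one URL per regular file in `directory`."""
    rows = [["Data URL"]]
    for name in sorted(os.listdir(directory)):
        is_file = os.path.isfile(os.path.join(directory, name))
        # Skip sub-folders and any data.csv left over from an earlier run.
        if is_file and name != "data.csv":
            # quote() percent-encodes spaces and other unsafe characters.
            rows.append([f"{base_url.rstrip('/')}/{quote(name)}"])
    return rows


def write_csv(rows, path="data.csv"):
    """Write the rows to `path` as a CSV file."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
```

For example, `write_csv(build_rows(".", "http://192.168.1.112:8000"))` writes data.csv for the current folder, using the example IP address from earlier in place of your own.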

Upload data.csv

Upload data.csv to https://app.labelbox.com/data. See above for detailed information on the CSV format Labelbox expects.

Notes

Only users on the same network can see your data when self-hosting.
If the local server is stopped, you will lose access to your data in Labelbox until it is started again.