Register New Batches in ScienceCloud with Project Data Components

Last Updated: Apr 30, 2018 09:10AM PDT

Summary: This article describes how to create a simple protocol that registers new batches for compounds stored in an SD file. The protocol used in this example is available in the protocol database as "Example from help topic Registering New Batches in SC".

Every protocol that connects to ScienceCloud needs to start with the ScienceCloud Connection component. It contains information about the target ScienceCloud server and about the authentication method to use when connecting to that server.

This example registers the first 10 compounds of NCI drugs in ScienceCloud. The data records coming out of the NCI Drug Reader component have three properties – cas_rn, name, and NSC.


The Define New Batch component creates a local Batch data record with the properties required for registration. Important points to consider include:

  • Defining a new batch requires, at a minimum, a Connection Name, a Project, and a Batch Group ID.
  • Several Options are provided. For example, you can Validate that the resulting batch object could be registered.
  • All other parameters are PilotScript expressions. Their values can be set using a combination of static text and properties. All are optional – if you do not provide them, defaults are used (or read from the data property stream, if specific property names already exist on the incoming data).
  • The Batch ID parameter is of particular note. You may provide a specific Batch ID, or the ScienceCloud team to which you are connected may be configured to automatically generate a Batch ID for you. If you provide a Batch ID, it cannot already exist on another batch within the ScienceCloud team to which you are connected.
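
The fallback behavior for optional parameters can be pictured with a short Python sketch. It is illustrative only: it is not PilotScript and not the component's actual implementation. The function name resolve_parameter, the property name BATCH_TAG, and the sample values are hypothetical, and the precedence shown (explicit value, then an existing incoming property, then the default) is an assumption based on the description above.

```python
def resolve_parameter(explicit_value, record, property_name, default):
    """Pick a value: explicit parameter, else incoming property, else default."""
    if explicit_value is not None:
        return explicit_value
    if property_name in record:
        return record[property_name]
    return default


# Hypothetical incoming record that already carries a BATCH_TAG property.
record = {"cas_rn": "0000-00-0", "BATCH_TAG": "from-data"}

explicit = resolve_parameter("set-by-user", record, "BATCH_TAG", "n/a")
inherited = resolve_parameter(None, record, "BATCH_TAG", "n/a")
defaulted = resolve_parameter(None, {}, "BATCH_TAG", "n/a")
```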

In this case, we set the Batch ID to the value of the property cas_rn, prefixed with the string 'ASC'. The Batch Tag is set to the value of the property NSC, and a comment is added.

Note: If you run this protocol, you will have to select a different prefix so that the Batch ID is not already in use.
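
As a rough illustration of the mapping just described, the following Python sketch derives the batch fields from one incoming record. It only mirrors the logic; in the protocol itself these values are set as PilotScript expressions on the Define New Batch component. The function name, the field names BATCH_TAG and COMMENT, the comment text, and the sample record values are all hypothetical.

```python
def define_batch_fields(record, prefix="ASC", comment="Registered by example protocol"):
    """Derive batch fields from an incoming record's properties."""
    return {
        # Batch ID: chosen prefix followed by the cas_rn property.
        "BATCH_ID": f"{prefix}{record['cas_rn']}",
        # Batch Tag: the value of the NSC property.
        "BATCH_TAG": str(record["NSC"]),
        "COMMENT": comment,
    }


# Made-up record with the three properties the reader component provides.
record = {"cas_rn": "0000-00-0", "name": "example compound", "NSC": 12345}

# Use a prefix other than 'ASC' so the Batch ID is not already in use.
batch = define_batch_fields(record, prefix="XYZ")
```

Making the prefix a parameter reflects the note above: changing it in one place is enough to produce Batch IDs that do not already exist.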

If a Batch data record leaves the component through the Pass port, the record is ready for registration in ScienceCloud. If the record is rejected through the Fail port, an Error property contains the reason for the rejection.
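
The Pass/Fail behavior can be sketched in Python as follows. This is a simplified model under stated assumptions, not the component's real validation: it checks only the Batch ID uniqueness rule, the function name route_batches and the error message wording are hypothetical, and a missing Batch ID is treated as "to be generated by ScienceCloud".

```python
def route_batches(batches, existing_ids):
    """Split batches into pass and fail lists, recording a reason on failures."""
    passed, failed = [], []
    seen = set(existing_ids)
    for batch in batches:
        batch_id = batch.get("BATCH_ID")
        if batch_id is not None and batch_id in seen:
            # Rejected records carry an Error property with the reason.
            failed.append({**batch, "Error": f"Batch ID {batch_id} already exists"})
        else:
            if batch_id is not None:
                seen.add(batch_id)
            passed.append(batch)
    return passed, failed


# Made-up Batch IDs; 'ASC1' is assumed to exist already in the team.
passed, failed = route_batches(
    [{"BATCH_ID": "ASC1"}, {"BATCH_ID": "XYZ2"}],
    existing_ids={"ASC1"},
)
```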

Your Batch data record can now be sent to any component that accepts batches. In most cases, you will send the data record to the Register New Batches component.

There are several parameters on Register New Batches that affect registration or the returned Batch.

  • Validate Structure – If True, executes the project's validation protocol to check whether the structure follows the business rules defined by the customer. The validation protocol may also repair the structure.
  • Check Existing Structure – If True, searches for duplicate compounds in the database and, if found, adds this batch as a new batch to that existing structure.
  • Ignore Structure – If True, ignores any structural information and only registers the batch properties.
  • Return New Batch – If True, returns all properties of the new batch, including the values of all defaulted properties. If False, then only the incoming properties are returned (with the addition of BATCH_ID and COMPOUND_ID).
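
The effect of Return New Batch on the output record can be illustrated with a small Python sketch. It models only the selection of returned properties as described in the last bullet; the function name returned_properties, the defaulted property name STATUS, and all sample values are hypothetical.

```python
def returned_properties(incoming, registered, return_new_batch):
    """Choose which properties come back after registration."""
    if return_new_batch:
        # Everything on the new batch, including defaulted properties.
        return dict(registered)
    # Only the incoming properties, plus the two generated identifiers.
    result = dict(incoming)
    result["BATCH_ID"] = registered["BATCH_ID"]
    result["COMPOUND_ID"] = registered["COMPOUND_ID"]
    return result


# Made-up records: 'registered' stands for the batch as stored in ScienceCloud.
incoming = {"name": "example compound"}
registered = {
    "name": "example compound",
    "BATCH_ID": "XYZ0000-00-0",
    "COMPOUND_ID": "CMP-1",
    "STATUS": "default value",
}

full = returned_properties(incoming, registered, return_new_batch=True)
minimal = returned_properties(incoming, registered, return_new_batch=False)
```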

The Batch to Generic Data component cleans up the data for viewing in the viewer of your choice. The property names that are displayed correspond to the field names shown in the Project Data web interface. You can also view the raw data to see the internal property names for the various fields.

Here are the first two molecules as output:


Project Data Collection Overview

Project Data Collection Components

Connecting to ScienceCloud with Project Data Components

Querying Batches and Saving as Hit Lists in ScienceCloud with Project Data Components