DataStore API

The stix2 library features an interface for pulling and pushing STIX 2 content. This interface consists of DataStore, DataSource and DataSink constructs: a DataSource for pulling STIX 2 content, a DataSink for pushing STIX 2 content, and a DataStore for both pulling and pushing.

The DataStore, DataSource, and DataSink APIs (collectively referred to as the “DataStore suite”) are not used directly. They are base classes that concrete DataStore suites subclass. The stix2 library provides three such suites: FileSystem, Memory, and TAXII. Users are also encouraged to subclass the base classes and create their own custom DataStore suites.

CompositeDataSource

CompositeDataSource is a controller that provides a single interface to a set of defined DataSources. Its purpose is to allow DataSources to be grouped so that get()/all_versions()/query() calls can be made to the whole set in one API call. CompositeDataSources can be used to organize/group DataSources, federate get()/all_versions()/query() calls, and reduce user code.

CompositeDataSource is a wrapper around a set of defined DataSources (e.g. FileSystemSource). It issues each get()/all_versions()/query() call individually to every attached DataSource, collects the results from each DataSource, and returns them.
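The fan-out-and-collect behaviour described above can be illustrated without the library. The following is a minimal pure-Python sketch, not the stix2 implementation: `ListSource` and `Composite` are hypothetical stand-ins, the objects are plain dicts with shortened placeholder IDs, and dropping duplicate versions by `(id, modified)` is an assumption of this sketch.

```python
class ListSource:
    """Hypothetical stand-in for a DataSource backed by a list of dicts."""

    def __init__(self, objects):
        self.objects = list(objects)

    def all_versions(self, stix_id):
        return [o for o in self.objects if o["id"] == stix_id]

    def query(self, predicate):
        return [o for o in self.objects if predicate(o)]


class Composite:
    """Hypothetical wrapper that sends each call to every attached source."""

    def __init__(self):
        self.sources = []

    def add_data_sources(self, sources):
        self.sources.extend(sources)

    def _collect(self, call):
        # Ask each source in turn, then drop repeats of the same version.
        seen, merged = set(), []
        for source in self.sources:
            for obj in call(source):
                key = (obj["id"], obj["modified"])
                if key not in seen:
                    seen.add(key)
                    merged.append(obj)
        return merged

    def all_versions(self, stix_id):
        return self._collect(lambda s: s.all_versions(stix_id))

    def get(self, stix_id):
        # Return the most recently modified version, or None if not found.
        versions = self.all_versions(stix_id)
        return max(versions, key=lambda o: o["modified"]) if versions else None

    def query(self, predicate):
        return self._collect(lambda s: s.query(predicate))


old = {"id": "malware--1", "type": "malware", "modified": "2020-01-01T00:00:00Z"}
new = {"id": "malware--1", "type": "malware", "modified": "2021-01-01T00:00:00Z"}
ind = {"id": "indicator--1", "type": "indicator", "modified": "2020-06-01T00:00:00Z"}

cs = Composite()
cs.add_data_sources([ListSource([old, ind]), ListSource([old, new])])

latest = cs.get("malware--1")             # newest version across both sources
versions = cs.all_versions("malware--1")  # both versions, duplicate removed
malware = cs.query(lambda o: o["type"] == "malware")
```

The caller makes one call on `cs` and never needs to know which source held which object, which is the reduction in user code the section refers to.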

Filters can be attached to CompositeDataSources just as they can to DataStores and DataSources. When get()/all_versions()/query() calls are made to the CompositeDataSource, it passes any query filters from the call, together with its own attached filters, to the attached DataSources. Those DataSources may have attached filters of their own. The effect is that all of these filters are combined when the get()/all_versions()/query() call is actually executed within a DataSource.
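To make the combination of filters concrete, here is a minimal pure-Python sketch under stated assumptions. `EqFilter`, `FilteredSource` and `FilteredComposite` are hypothetical names, only equality filters are modelled, and an object is returned only if it satisfies every filter in the combined set.

```python
from collections import namedtuple

# Hypothetical, simplified filter: only the "=" comparison is modelled.
EqFilter = namedtuple("EqFilter", ["prop", "value"])


def matches_all(obj, filters):
    """An object passes only if it satisfies every filter."""
    return all(obj.get(f.prop) == f.value for f in filters)


class FilteredSource:
    def __init__(self, objects, filters=()):
        self.objects = list(objects)
        self.filters = set(filters)

    def query(self, query_filters=(), composite_filters=()):
        # Filters from the call, from the composite, and from this source
        # are merged into one set before anything is evaluated.
        combined = set(query_filters) | set(composite_filters) | self.filters
        return [o for o in self.objects if matches_all(o, combined)]


class FilteredComposite:
    def __init__(self, sources, filters=()):
        self.sources = list(sources)
        self.filters = set(filters)

    def query(self, query_filters=()):
        # Pass the call's filters and the composite's own filters down.
        results = []
        for source in self.sources:
            results.extend(source.query(query_filters, self.filters))
        return results


objects = [
    {"id": "a", "type": "malware", "revoked": False, "created_by_ref": "identity--x"},
    {"id": "b", "type": "malware", "revoked": True, "created_by_ref": "identity--x"},
    {"id": "c", "type": "malware", "revoked": False, "created_by_ref": "identity--y"},
    {"id": "d", "type": "indicator", "revoked": False, "created_by_ref": "identity--x"},
]

source = FilteredSource(objects, filters=[EqFilter("revoked", False)])
composite = FilteredComposite([source], filters=[EqFilter("created_by_ref", "identity--x")])

hits = composite.query([EqFilter("type", "malware")])
ids = [o["id"] for o in hits]
```

Only object `a` satisfies the query filter, the composite's filter, and the source's filter at once, so it is the only result.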

A CompositeDataSource can also be attached to a CompositeDataSource for multiple layers of grouped DataSources.

CompositeDataSource API

CompositeDataSource Examples

[9]:
from taxii2client import Collection
from stix2 import CompositeDataSource, FileSystemSource, TAXIICollectionSource

# create FileSystemSource
fs = FileSystemSource("/tmp/stix2_source")

# create TAXIICollectionSource
colxn = Collection('http://127.0.0.1:5000/trustgroup1/collections/91a7b528-80eb-42ed-a74d-c6fbd5a26116/', user="user1", password="Password1")
ts = TAXIICollectionSource(colxn)

# add them both to the CompositeDataSource
cs = CompositeDataSource()
cs.add_data_sources([fs,ts])

# get an object that is only in the filesystem
intrusion_set = cs.get('intrusion-set--f3bdec95-3d62-42d9-a840-29630f6cdc1a')
print(intrusion_set.serialize(pretty=True))

# get an object that is only in the TAXII collection
ind = cs.get('indicator--a740531e-63ff-4e49-a9e1-a0a3eed0e3e7')
print(ind.serialize(pretty=True))
[9]:
{
    "type": "intrusion-set",
    "spec_version": "2.1",
    "id": "intrusion-set--f3bdec95-3d62-42d9-a840-29630f6cdc1a",
    "created_by_ref": "identity--c78cb6e5-0c4b-4611-8297-d1b8b55e40b5",
    "created": "2017-05-31T21:31:53.197Z",
    "modified": "2017-05-31T21:31:53.197Z",
    "name": "DragonOK",
    "description": "DragonOK is a threat group that has targeted Japanese organizations with phishing emails. Due to overlapping TTPs, including similar custom tools, DragonOK is thought to have a direct or indirect relationship with the threat group Moafee. [[Citation: Operation Quantum Entanglement]][[Citation: Symbiotic APT Groups]] It is known to use a variety of malware, including Sysget/HelloBridge, PlugX, PoisonIvy, FormerFirstRat, NFlog, and NewCT. [[Citation: New DragonOK]]",
    "aliases": [
        "DragonOK"
    ],
    "external_references": [
        {
            "source_name": "mitre-attack",
            "url": "https://attack.mitre.org/wiki/Group/G0017",
            "external_id": "G0017"
        },
        {
            "source_name": "Operation Quantum Entanglement",
            "description": "Haq, T., Moran, N., Vashisht, S., Scott, M. (2014, September). OPERATION QUANTUM ENTANGLEMENT. Retrieved November 4, 2015.",
            "url": "https://www.fireeye.com/content/dam/fireeye-www/global/en/current-threats/pdfs/wp-operation-quantum-entanglement.pdf"
        },
        {
            "source_name": "Symbiotic APT Groups",
            "description": "Haq, T. (2014, October). An Insight into Symbiotic APT Groups. Retrieved November 4, 2015.",
            "url": "https://dl.mandiant.com/EE/library/MIRcon2014/MIRcon%202014%20R&D%20Track%20Insight%20into%20Symbiotic%20APT.pdf"
        },
        {
            "source_name": "New DragonOK",
            "description": "Miller-Osborn, J., Grunzweig, J.. (2015, April). Unit 42 Identifies New DragonOK Backdoor Malware Deployed Against Japanese Targets. Retrieved November 4, 2015.",
            "url": "http://researchcenter.paloaltonetworks.com/2015/04/unit-42-identifies-new-dragonok-backdoor-malware-deployed-against-japanese-targets/"
        }
    ],
    "object_marking_refs": [
        "marking-definition--fa42a846-8d90-4e51-bc29-71d5b4802168"
    ]
}
[9]:
{
    "type": "indicator",
    "spec_version": "2.1",
    "id": "indicator--a740531e-63ff-4e49-a9e1-a0a3eed0e3e7",
    "created": "2017-11-13T07:00:24.000Z",
    "modified": "2017-11-13T07:00:24.000Z",
    "name": "Ransomware IP Blocklist",
    "description": "IP Blocklist address from abuse.ch",
    "indicator_types": [
        "malicious-activity",
        "Ransomware",
        "Botnet",
        "C&C"
    ],
    "pattern": "[ ipv4-addr:value = '91.237.247.24' ]",
    "pattern_type": "stix",
    "pattern_version": "2.1",
    "valid_from": "2017-11-13T07:00:24Z",
    "external_references": [
        {
            "source_name": "abuse.ch",
            "url": "https://ransomwaretracker.abuse.ch/blocklist/"
        }
    ]
}

Filters

The stix2 DataStore suites - FileSystem, Memory, and TAXII - all use the Filters module to allow for the querying of STIX content. Filters can be used to explicitly include or exclude results with certain criteria. For example:

  • only trust content from a set of object creators
  • exclude content from certain (untrusted) object creators
  • only include content with a confidence above a certain threshold (the confidence property is available as of STIX 2.1)
  • only return content that can be shared with external parties (e.g. only content that has TLP:GREEN markings)

Filters can be supplied with each call to query(), attached to a DataStore, or both. Every later query made to that DataStore is evaluated against its attached filters in addition to any filters supplied with the query call. Attached filters can also be removed from DataStores.

A Filter consists of three parts: a property name, a comparison operator, and a value to compare the object's property against. All properties of STIX 2 objects can be filtered on. In addition, TAXII 2 filtering parameters can also be used as filter fields.

TAXII2 filter fields:

  • added_after
  • id
  • spec_version
  • type
  • version

Supported operators:

  • =
  • !=
  • in
  • >
  • <
  • >=
  • <=
  • contains

Value types of the property values must be one of these (Python) types:

  • bool
  • dict
  • float
  • int
  • list
  • str
  • tuple
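The three-part structure of a filter and the operator list above can be sketched as a small evaluator. This is a hedged illustration in plain Python, not the library's code: `SimpleFilter`, `OPERATORS` and `check` are hypothetical names, the reading of `in` and `contains` is an assumption of the sketch, and nested property paths such as `external_references.source_name`, list-valued properties and timestamp parsing are deliberately left out.

```python
import operator
from collections import namedtuple

# Hypothetical stand-in for a filter: (property name, operator, value).
SimpleFilter = namedtuple("SimpleFilter", ["prop", "op", "value"])

# Each operator string maps to a function of (object value, filter value).
OPERATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    # Assumption: "in" tests the object's value against the filter's container,
    # "contains" tests the filter's value against the object's container.
    "in": lambda obj_value, filter_value: obj_value in filter_value,
    "contains": lambda obj_value, filter_value: filter_value in obj_value,
}


def check(obj, flt):
    """Return True if obj has the property and it satisfies the filter."""
    if flt.prop not in obj:
        return False
    return OPERATORS[flt.op](obj[flt.prop], flt.value)


obj = {
    "type": "indicator",
    "confidence": 80,
    "labels": ["malicious-activity"],
    "revoked": False,
}

type_in = check(obj, SimpleFilter("type", "in", ("indicator", "malware")))
confident = check(obj, SimpleFilter("confidence", ">=", 50))
labelled = check(obj, SimpleFilter("labels", "contains", "malicious-activity"))
is_revoked = check(obj, SimpleFilter("revoked", "=", True))
missing = check(obj, SimpleFilter("name", "=", "anything"))
```

Here the first three checks succeed, while `is_revoked` and `missing` are false: the object is not revoked, and it has no `name` property to compare.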

Filter Examples

[10]:
from stix2 import Filter

# create filter for STIX objects that have external references to MITRE ATT&CK framework
f = Filter("external_references.source_name", "=", "mitre-attack")

# create filter for STIX objects that are not of SDO type Attack-Pattern
f1 = Filter("type", "!=", "attack-pattern")

# create filter for STIX objects that have the "threat-report" label
f2 = Filter("labels", "in", "threat-report")

# create filter for STIX objects that were modified at or after the timestamp
f3 = Filter("modified", ">=", "2017-01-28T21:33:10.772474Z")

# create filter for STIX objects that have been revoked
f4 = Filter("revoked", "=", True)

For Filters to be applied to a query, they must be either supplied with the query call or attached to a DataSource. That DataSource can stand by itself or be part of a DataStore, in which case it is reached through the DataStore's source attribute.

[11]:
from stix2 import MemoryStore, FileSystemStore, FileSystemSource

fs = FileSystemStore("/tmp/stix2_store")
fs_source = FileSystemSource("/tmp/stix2_source")

# attach filter to FileSystemStore
fs.source.filters.add(f)

# attach multiple filters to FileSystemStore
fs.source.filters.add([f1,f2])

# can also attach filters to a Source
# attach multiple filters to FileSystemSource
fs_source.filters.add([f3, f4])


mem = MemoryStore()

# As it is impractical to only use MemorySink or MemorySource,
# attach a filter to a MemoryStore
mem.source.filters.add(f)

# attach multiple filters to a MemoryStore
mem.source.filters.add([f1,f2])

Note: The ``defanged`` property is now always included (implicitly) for STIX 2.1 Cyber Observable Objects (SCOs)

This is important to remember if you are writing a filter that involves checking the objects property of a STIX 2.1 ObservedData object. If any of the objects associated with the objects property are STIX 2.1 SCOs, then your filter must include the defanged property. For an example, refer to filters[14] & filters[15] in stix2/test/v21/test_datastore_filters.py

De-Referencing Relationships

Given a STIX object, there are several ways to find other STIX objects related to it. To illustrate this, let’s first create a DataStore and add some objects and relationships.

[14]:
from stix2 import Campaign, Identity, Indicator, Malware, MemoryStore, Relationship

mem = MemoryStore()
cam = Campaign(name='Charge', description='Attack!')
idy = Identity(name='John Doe', identity_class="individual")
ind = Indicator(pattern_type='stix', pattern="[file:hashes.MD5 = 'd41d8cd98f00b204e9800998ecf8427e']")
mal = Malware(name="Cryptolocker", is_family=False, created_by_ref=idy)
rel1 = Relationship(ind, 'indicates', mal)
rel2 = Relationship(mal, 'targets', idy)
rel3 = Relationship(cam, 'uses', mal)
mem.add([cam, idy, ind, mal, rel1, rel2, rel3])

If a STIX object has a created_by_ref property, you can use the creator_of() method to retrieve the Identity object that created it.

[15]:
print(mem.creator_of(mal).serialize(pretty=True))
[15]:
{
    "type": "identity",
    "spec_version": "2.1",
    "id": "identity--a2628104-e357-44a0-b16f-d5f36c0fd0ec",
    "created": "2020-06-26T13:59:21.924055Z",
    "modified": "2020-06-26T13:59:21.924055Z",
    "name": "John Doe",
    "identity_class": "individual"
}

Use the relationships() method to retrieve all the relationship objects that reference a STIX object.

[16]:
rels = mem.relationships(mal)
len(rels)
[16]:
3

You can limit it to only specific relationship types:

[17]:
mem.relationships(mal, relationship_type='indicates')
[17]:
[Relationship(type='relationship', spec_version='2.1', id='relationship--ef837187-773c-41e4-ae86-c66189a832f5', created='2020-06-26T13:59:21.929336Z', modified='2020-06-26T13:59:21.929336Z', relationship_type='indicates', source_ref='indicator--9f10f6f2-b93d-488e-be35-72c3ec1087c3', target_ref='malware--315597db-2a74-4a29-8e54-38572e1ac07b')]

You can limit it to only relationships where the given object is the source:

[18]:
mem.relationships(mal, source_only=True)
[18]:
[Relationship(type='relationship', spec_version='2.1', id='relationship--43f5f7a7-8a99-4bbf-8d93-e6f3fd2951a3', created='2020-06-26T13:59:21.937132Z', modified='2020-06-26T13:59:21.937132Z', relationship_type='targets', source_ref='malware--315597db-2a74-4a29-8e54-38572e1ac07b', target_ref='identity--a2628104-e357-44a0-b16f-d5f36c0fd0ec')]

And you can limit it to only relationships where the given object is the target:

[19]:
mem.relationships(mal, target_only=True)
[19]:
[Relationship(type='relationship', spec_version='2.1', id='relationship--ef837187-773c-41e4-ae86-c66189a832f5', created='2020-06-26T13:59:21.929336Z', modified='2020-06-26T13:59:21.929336Z', relationship_type='indicates', source_ref='indicator--9f10f6f2-b93d-488e-be35-72c3ec1087c3', target_ref='malware--315597db-2a74-4a29-8e54-38572e1ac07b'),
 Relationship(type='relationship', spec_version='2.1', id='relationship--596c196f-2f05-4584-b643-2186b327a94f', created='2020-06-26T13:59:21.937354Z', modified='2020-06-26T13:59:21.937354Z', relationship_type='uses', source_ref='campaign--d359f872-7e44-4090-8e08-c5bd10bc5f2d', target_ref='malware--315597db-2a74-4a29-8e54-38572e1ac07b')]

Finally, you can retrieve all STIX objects related to a given STIX object using related_to(). This calls relationships() and then performs the extra step of getting the objects that these Relationships point to. related_to() takes all the same arguments that relationships() does.

[20]:
mem.related_to(mal, target_only=True, relationship_type='uses')
[20]:
[Campaign(type='campaign', spec_version='2.1', id='campaign--d359f872-7e44-4090-8e08-c5bd10bc5f2d', created='2020-06-26T13:59:21.923792Z', modified='2020-06-26T13:59:21.923792Z', name='Charge', description='Attack!')]
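The two steps behind related_to() can be sketched in plain Python. This is an illustration under assumptions rather than the library's code: the functions below are hypothetical re-implementations over dicts, the IDs are shortened placeholders, and rejecting `source_only` together with `target_only` is a choice made for this sketch.

```python
def relationships(rels, obj_id, relationship_type=None,
                  source_only=False, target_only=False):
    """Return the relationship dicts that reference obj_id, optionally narrowed."""
    if source_only and target_only:
        raise ValueError("Search either source only or target only, but not both")
    found = []
    for rel in rels:
        is_source = rel["source_ref"] == obj_id
        is_target = rel["target_ref"] == obj_id
        if source_only:
            hit = is_source
        elif target_only:
            hit = is_target
        else:
            hit = is_source or is_target
        if hit and relationship_type in (None, rel["relationship_type"]):
            found.append(rel)
    return found


def related_to(objects, rels, obj_id, **kwargs):
    """Step 1: find the relationships. Step 2: fetch the objects at the other end."""
    other_ids = set()
    for rel in relationships(rels, obj_id, **kwargs):
        other_ids.update((rel["source_ref"], rel["target_ref"]))
    other_ids.discard(obj_id)
    return [objects[i] for i in sorted(other_ids) if i in objects]


# Mirrors the example data: indicator -> malware, malware -> identity,
# campaign -> malware.
objects = {
    "campaign--c": {"id": "campaign--c", "name": "Charge"},
    "identity--i": {"id": "identity--i", "name": "John Doe"},
    "indicator--n": {"id": "indicator--n"},
    "malware--m": {"id": "malware--m", "name": "Cryptolocker"},
}
rels = [
    {"relationship_type": "indicates", "source_ref": "indicator--n", "target_ref": "malware--m"},
    {"relationship_type": "targets", "source_ref": "malware--m", "target_ref": "identity--i"},
    {"relationship_type": "uses", "source_ref": "campaign--c", "target_ref": "malware--m"},
]

all_rels = relationships(rels, "malware--m")
users = related_to(objects, rels, "malware--m",
                   target_only=True, relationship_type="uses")
```

As in the notebook cells, the malware object is referenced by three relationships, and narrowing to `uses` relationships that target it leaves only the campaign.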