Listing setting

2022-02-18 08:33:07

List Setting

The list function helps users manage the objects in a storage space. It periodically (daily or weekly) generates a list file in a specific format (only CSV is currently supported) for all or part of the objects in a Bucket and stores the file in a specified Bucket. It can serve as a scheduled replacement for the synchronous List API operation of object storage.

Based on the list of objects, users can compile business statistics or run batch operations. Users can configure multiple list tasks in one Bucket to meet the demands of different dimensions.

The list file lists the stored objects and their corresponding metadata, and records the object attributes that users request in their configuration. While a list task runs, object content is not read directly; only the attribute information in the object metadata is scanned.

A list is generated periodically (daily or weekly). Counting from the date the list configuration is created, list files are generated each time the configured generation cycle elapses. If the list is generated weekly, one report is generated every 7 days after the initial report.
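
The generation cycle described above can be sketched with simple date arithmetic. This is only an illustration: the function name `report_dates` and the frequency strings `"daily"` and `"weekly"` are assumptions made for the example, not part of the OSS API, and the date of the initial report is supplied by the caller.

```python
from datetime import date, timedelta


def report_dates(first_report: date, frequency: str, count: int) -> list[date]:
    """Return the dates of the first `count` reports of a daily or weekly list.

    `first_report` is the date on which the initial report was generated.
    """
    # Hypothetical frequency names; a weekly list repeats every 7 days.
    step_days = {"daily": 1, "weekly": 7}[frequency]
    return [first_report + timedelta(days=step_days * i) for i in range(count)]


# Three consecutive weekly reports starting from an example date.
for day in report_dates(date(2022, 2, 18), "weekly", 3):
    print(day.isoformat())
```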

How to configure a list?

This part will describe how to configure a list, including detailed information about the list source storage space and target storage space.

List Source Storage Space and Target Storage Space

The storage space whose objects are listed is called the source storage space. The storage space where the list files are stored is called the target storage space. Before configuring a list, make sure you understand these two concepts:

Source Storage Space

It is the storage space where the list function is to be enabled. The list manifests the objects stored in the source storage space. You can get a list of the entire storage space or a list filtered by prefix (object key name).

Source Storage Space:

  • contains the objects manifested in the list.
  • contains the list configuration.

Target Storage Space

It is the storage space where the list is stored. List files are written to the target storage space, and you can group all list files under a common location by specifying a target prefix (object key name prefix) in the list configuration.

Target Storage Space:

  • contains the list files.
  • contains the Manifest files, which enumerate all list files stored in the target storage space.
  • must have a storage space policy that grants permission to write files to the storage space.
  • must be located in the same region (Region) as the source storage space.
  • can be the same as the source storage space.
  • can be owned by a JD Cloud & AI account different from the one that owns the source storage space.

Configure a list

A list task creates a list of the objects in a storage space on the schedule you define, for the purpose of storage management. You can configure multiple lists for a storage space. Lists are delivered as CSV files to the target storage space.

1. Specify the object information to be analyzed in the source storage space

Determine which object information is to be analyzed. The following information must be configured for the source storage space when the list function is configured:

  • Configure the object attributes to be analyzed: You need to specify which object attributes are recorded in the list report. At present, the supported attributes include account ID, source storage space name, object file name, object size, last modified date of object, ETag and storage type of object.

2. Configure the storage information of the list report

You need to specify how the list report is stored, i.e. whether a list report is generated daily or weekly and in which storage space it is stored. The required configuration information is as follows:

  • Select the frequency of exporting a list: A list is generated daily or weekly. You can select the required frequency through this configuration to execute the list function.
  • Configure the output location of lists: You need to specify the storage space where the list report is to be stored.

Which parameters are contained in a list?

A list file contains the list of objects in the source storage space and the metadata of each object. A list will be stored in the CSV format compressed by GZIP in the target storage space.

A list contains the list of objects in the storage space and the following metadata of each object listed:

  • Bucket name – Name of storage space to which the list is targeted.
  • Key name – name of the object file in the storage space. The object key name (or key) uniquely identifies the object in the storage space. When the CSV file format is used, the object file name is URL-encoded and must be decoded before use.
  • Size – Object size (unit: byte).
  • Last modified date – object creation date or last modification date (whichever is later).
  • ETag – The entity tag is a hash of the object. The ETag reflects only changes in object content, not changes in object metadata. The ETag may or may not be the MD5 digest of the object data, depending on how the object was created and encrypted.
  • Storage class – storage type used to store objects. For more information, please refer to Storage Type.
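
Reading a list file and decoding its key names can be sketched as follows. The sketch assumes that the file is UTF-8 text, that it has no header row, and that the caller supplies the column names in order (for example from the `fileSchema` entry of the Manifest). None of these details are confirmed here, and `read_list_file` is a hypothetical helper name.

```python
import csv
import gzip
import io
from urllib.parse import unquote


def read_list_file(data: bytes, columns: list[str]) -> list[dict[str, str]]:
    """Parse one GZIP-compressed CSV list file and URL-decode the Key column."""
    text = gzip.decompress(data).decode("utf-8")  # assumption: UTF-8 content
    rows = []
    for record in csv.reader(io.StringIO(text)):
        row = dict(zip(columns, record))
        # Key names are URL-encoded in the CSV and must be decoded before use.
        row["Key"] = unquote(row["Key"])
        rows.append(row)
    return rows


# A tiny in-memory sample standing in for a downloaded .csv.gz list file.
sample = gzip.compress(b'"example-source-bucket","photos/my%20report.txt","1024"\n')
rows = read_list_file(sample, ["Bucket", "Key", "Size"])
print(rows[0]["Key"])
```

If the service encodes spaces as `+` rather than `%20`, `urllib.parse.unquote_plus` would be the appropriate decoder instead.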

How to Use

Configure a list via the console

To configure a list via the console, refer to the [Console Operation Guide] for the list function.

Configure a list via the API

To configure the list function via the API, refer to the following API document:

List Report Storage Path

List reports and relevant Manifest files will be sent to the target storage space, and the list reports will be distributed to the following path:

destination-prefix/source-bucket/config-ID/

The relevant Manifest files will be distributed to the following location in the target storage space:

destination-prefix/source-bucket/config-ID/YYYY-MM-DDTHH-MMZ/manifest.json
destination-prefix/source-bucket/config-ID/YYYY-MM-DDTHH-MMZ/manifest.checksum
destination-prefix/source-bucket/config-ID/hive/dt=YYYY-MM-DD-HH-MM/symlink.txt

The lists will be distributed to the following location in the target storage space daily or weekly:

destination-prefix/source-bucket/config-ID/data/example-file-name.csv.gz

The meaning represented by paths is as follows:

  • destination-prefix: is the “target prefix” set by users during the list configuration, which can be used to group all list files at a public location in the target storage space.
  • source-bucket : is the name of the source storage space corresponding to the list report, which avoids conflicts when multiple source storage spaces send their own list reports to the same target storage space.
  • config-ID: is the “List Name” set by users during the list configuration, which can be used to avoid any conflict between multiple list reports sent from the same source storage space to the same target storage space. You can use config-ID to distinguish different list reports.
  • YYYY-MM-DDTHH-MMZ : is a timestamp, containing start time and date of scanning the storage space when a list report is generated; e.g. 2020-04-28T00-32Z.
  • manifest.json : is a Manifest file.
  • manifest.checksum : is MD5 of manifest.json file content.
  • symlink.txt : is a Manifest file compatible with Apache Hive.
  • example-file-name.csv.gz: is one of the CSV list files.
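
The path layout above can be assembled programmatically, for example to locate the Manifest of a given report. The following is a minimal sketch: the helper name `list_report_paths` is hypothetical, and treating the timestamp as UTC is an assumption based on the trailing `Z` in the format.

```python
from datetime import datetime, timezone


def list_report_paths(destination_prefix: str, source_bucket: str,
                      config_id: str, scan_start: datetime) -> dict[str, str]:
    """Build the object keys of one list report from the documented layout.

    `scan_start` must be timezone-aware; it is converted to UTC (assumption).
    """
    utc = scan_start.astimezone(timezone.utc)
    base = f"{destination_prefix}/{source_bucket}/{config_id}"
    stamp = utc.strftime("%Y-%m-%dT%H-%MZ")   # YYYY-MM-DDTHH-MMZ
    hive = utc.strftime("%Y-%m-%d-%H-%M")     # dt=YYYY-MM-DD-HH-MM
    return {
        "manifest": f"{base}/{stamp}/manifest.json",
        "checksum": f"{base}/{stamp}/manifest.checksum",
        "symlink": f"{base}/hive/dt={hive}/symlink.txt",
        "data_prefix": f"{base}/data/",
    }


# Example values only; the prefix, bucket and list name are placeholders.
paths = list_report_paths("inventory", "example-source-bucket", "daily-all",
                          datetime(2020, 4, 28, 0, 32, tzinfo=timezone.utc))
print(paths["manifest"])
```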

The relevant Manifest file includes two files: manifest.json and manifest.checksum.

The description of Manifest file is as follows:

What is a list Manifest?

The Manifest files manifest.json and symlink.txt describe the location of the list report. Every time a new list report is delivered, it comes with a new set of Manifest files.
Whenever a manifest.json file is written, it comes with a manifest.checksum file (the MD5 of the contents of the manifest.json file).
Each Manifest contained in a manifest.json file provides the metadata of the relevant list and other basic information, including:
  • Source storage space name.
  • Target storage space name.
  • List version.
  • Timestamp, containing the start time and date of scanning the storage space when the list report is generated.
  • The format and schema of the list file.
  • The list files that make up the report in the target storage space, with the size and md5Checksum of each.
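
Because manifest.checksum holds the MD5 of manifest.json, a downloaded Manifest can be checked before it is used. The sketch below assumes that the checksum file stores the digest as hexadecimal text, which is not stated above; `manifest_is_intact` is a hypothetical helper name.

```python
import hashlib


def manifest_is_intact(manifest_bytes: bytes, checksum_text: str) -> bool:
    """Compare the MD5 of manifest.json with the content of manifest.checksum."""
    # Assumption: the checksum file holds a hex digest, possibly with a newline.
    expected = checksum_text.strip().lower()
    actual = hashlib.md5(manifest_bytes).hexdigest()
    return actual == expected


# Stand-in content; in practice both values are downloaded from the target storage space.
body = b'{"sourceBucket": "example-source-bucket"}'
checksum = hashlib.md5(body).hexdigest() + "\n"
print(manifest_is_intact(body, checksum))
```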

The example of Manifest in manifest.json file in the CSV format is as follows:

{
 "sourceBucket": "example-source-bucket",
 "destinationBucket": "example-inventory-destination-bucket",
 "fileFormat": "CSV",
 //"version": "2016-11-30",
 "creationTimestamp": "1514944800000",
 "fileSchema": "Bucket, Key, VersionId, Size, LastModifiedDate, ETag, StorageClass, IsMultipartUploaded, ReplicationStatus",
 "files": [
  {
   "key": "Inventory/example-source-bucket/2016-11-06T21-32Z/files/04d73d9debc73d9f0bf85af461abde6c.csv.gz",
   "size": 21999232,
   "MD5checksum": "7d40288a09c25b302ad6cb5fced54f35"
  }
 ]
}
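
A Manifest like the example above can be loaded to find the list files and the column order. This is a sketch under assumptions: the line beginning with `//` in the example is not valid JSON, so the helper simply drops such lines before parsing, and the name `parse_manifest` as well as the added `columns` entry are hypothetical.

```python
import json


def parse_manifest(text: str) -> dict:
    """Parse manifest.json text and add a `columns` list taken from fileSchema."""
    # Drop lines commented out with // (as in the example), which JSON forbids.
    lines = [ln for ln in text.splitlines() if not ln.lstrip().startswith("//")]
    manifest = json.loads("\n".join(lines))
    manifest["columns"] = [c.strip() for c in manifest["fileSchema"].split(",")]
    return manifest


example = """{
 "sourceBucket": "example-source-bucket",
 "fileFormat": "CSV",
 //"version": "2016-11-30",
 "fileSchema": "Bucket, Key, Size",
 "files": [
  {"key": "example-file-name.csv.gz", "size": 21999232}
 ]
}"""
manifest = parse_manifest(example)
print(manifest["columns"], [f["key"] for f in manifest["files"]])
```

The `columns` list gives the order in which to read the fields of each row in the CSV list files.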

List Consistency

List reports are eventually consistent for PUTs of new objects, for overwrite PUTs, and for DELETEs. A list is a rolling snapshot of the items in a storage space and is eventually consistent, i.e. it may not contain objects that were recently added or removed. For example, if a user uploads or deletes an object while a configured list task is running, the result of that operation may not be reflected in the list report.

If you need to verify the status of an object before operating on it, it is recommended to send a HEAD Object API request to retrieve the metadata of the object, or to inspect the object attributes on the object storage console.

Console Operation Guide

List Function

The OSS list function provides a flat-file list of your objects and their metadata, and can serve as a scheduled replacement for the synchronous List API operation of OSS. On a daily or weekly basis, it outputs a comma-separated values (CSV) file that lists the objects in a storage space, or the objects sharing a prefix (i.e. objects whose names start with the same character string), together with their corresponding metadata.

List Configuration Steps

  • It might take 48 hours to deliver the first report.
  1. Log in to the Object Storage Console;
  2. In the [Space Management] list, select the storage space (source storage space) for which you want to use the list function and then click to enter the space;
  3. Click the [Space Setting] tab and then select the [List Setting] option;
  4. Click [Add a List] ;
  5. On the [Add a List] page, configure the list as follows:
    • List name: Enter a name for the output list.
    • Filtering condition: (Optional) Add a prefix as the filtering condition. Only objects whose names start with this character string are listed. If no value is entered, no filtering is applied by default.
    • Target storage space: Select the target storage space where the report is to be stored. The default target storage space is the source storage space. The target storage space and the source storage space must be located in the same region. The target storage space can belong to a different JD Cloud account.
    • Target prefix: (Optional) You can select a prefix in the target storage space to group the list files under a common location.
    • Frequency: Select how often a list is generated. It is exported on a daily or weekly basis; if no selection is made, daily exporting is the default.
    • Status: Select whether the list is enabled or disabled.
    • [Advanced Setting] : In Advanced Setting, you can configure more list information. If no advanced setting is made, All is the default.
    • Output format: The list is output in the CSV format by default.
    • List Information: Select the object information to be included in the list report. The optional items are as follows: object size, storage type, ETag, and last modification time. If no selection is made, All is the default.
    • Encrypt: Select whether the list needs server-side encryption. Encryption of list reports is not supported at present.
  6. Confirm that the configuration information is correct, and then click [Save] to complete the addition.
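
The options and defaults in the steps above can be summarized as a small data structure. This sketch is purely illustrative: `ListConfig`, its field names and the attribute labels are hypothetical and do not correspond to any confirmed OSS API or console identifier.

```python
from dataclasses import dataclass

# Hypothetical labels for the optional items of "List Information".
ALL_FIELDS = ("Size", "StorageClass", "ETag", "LastModifiedDate")


@dataclass
class ListConfig:
    name: str                      # List name
    source_bucket: str             # storage space the list is configured on
    enabled: bool                  # Status
    filter_prefix: str = ""        # empty prefix: no filtering
    target_bucket: str = ""        # empty: defaults to the source storage space
    target_prefix: str = ""        # optional grouping prefix
    frequency: str = "daily"       # daily is the default
    output_format: str = "CSV"     # CSV is the default
    fields: tuple[str, ...] = ALL_FIELDS  # All is the default

    def __post_init__(self) -> None:
        if not self.target_bucket:
            self.target_bucket = self.source_bucket
        if self.frequency not in ("daily", "weekly"):
            raise ValueError("frequency must be 'daily' or 'weekly'")

    def includes(self, key: str) -> bool:
        """Return True if an object key passes the prefix filter."""
        return key.startswith(self.filter_prefix)


config = ListConfig("logs-weekly", "example-source-bucket", True,
                    filter_prefix="logs/", frequency="weekly")
print(config.target_bucket, config.includes("logs/2022/02/18.log"))
```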

Target storage space policy

Create a storage space permission policy on the target storage space that grants OSS write permission, so that OSS can write the list report data to the storage space.

If you select a target storage space in another account and do not have permission to read and write its storage space policy, you will see the following message: ‘Failed to save. Failed to create a Bucket policy on the destination Bucket. Please contact the destination Bucket owner to add the relevant Bucket policy and allow this account to place data in Bucket’. In this case, the owner of the target storage space must add the displayed storage space policy to the target storage space. If the policy is not added to the target storage space, you will not get the list report, because the source storage space owner has no permission to write to the target storage space.
