Chapter 4. Storage: Cloud Storage
Google Cloud Storage provides the ability to store binary large objects (a.k.a. BLOBs) in the cloud. You can read or write BLOBs from anywhere on the Internet, subject to access control that you can define, including from any Compute Engine instance, no matter which zone it is in. Your BLOBs are also stored in a durable, available, and secure way, and are served by a massively scalable distributed infrastructure.
This makes Cloud Storage ideal for two key scenarios:
-
Storing and retrieving unstructured data from any Compute Engine instance in any zone
-
Storing and retrieving unstructured data from both inside and outside Compute Engine (e.g., for sharing data with your customers)
A particularly important special case of the second scenario is importing or exporting large quantities of data into and out of Google's Cloud Platform.
It is important to understand the difference between durability and availability when it comes to storage. Roughly speaking, durability is defined by the probability that an object will not be permanently and irrevocably lost, whereas availability is defined by the probability that an object can be retrieved at a particular point in time. Cloud Storage provides a service-level agreement (SLA) that clearly specifies what level of availability the customer should expect and what sort of compensation the customer will receive if that SLA is not met. Cloud Storage offers multiple classes of storage, at different prices, with different SLAs. However, in all the current offerings, the data is stored in a highly durable manner, which means that even if you can't access your data right now, it is not lost forever. It will become available again at some point in the (not too distant) future. Google does not provide an estimate of durability. In practice, with multiple copies of your data stored in multiple data centers, the probability of losing every copy of your data is extremely low, making the theoretical durability incredibly high.
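To get a feel for why replication drives theoretical durability so high, here is a back-of-the-envelope sketch in Python. The per-copy loss probability is an invented illustration, not a figure Google publishes, and it assumes copies fail independently:

# Hypothetical illustration of replication vs. durability.
# The per-copy loss probability is invented, not a published figure.
p_loss_one_copy = 1e-3  # assumed chance of losing a single copy in a year
for copies in (1, 2, 3):
    # With independent failures, every copy must be lost to lose the object.
    p_loss_all = p_loss_one_copy ** copies
    print("copies=%d  P(object lost)=%g" % (copies, p_loss_all))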
Understanding BLOB Storage
BLOB has become a common industry term for a file of any type. While formats such as ASCII or Unicode are not generally considered binary files, they are made up of ones and zeros just like JPEG images, MPEG movies, Linux executable files, or any other type of file. Cloud Storage is considered a BLOB storage system because it treats all files as unstructured binary data.
Similarly, there's no particular reason why a BLOB needs to be particularly large. Cloud Storage is perfectly happy to store an object that has zero bytes of content. BLOBs in Cloud Storage can be up to 5 TB in size.
At this point, you might be saying to yourself, "A BLOB sounds a whole lot like a file to me." And you would be exactly right. However, one reason the industry has taken to referring to this style of storage as a "BLOB store" instead of "filesystem," and thus calling the contents "BLOBs" or "objects" instead of "files," is because the word "filesystem" implies a great deal of functionality that so-called "BLOB stores" typically do not provide. Not providing certain filesystem features offers some useful scalability tradeoffs. After we've taken a tour of Cloud Storage, we'll return to this topic and examine it in more detail; but for now, just keep in mind that while Cloud Storage may look and feel a whole lot like a filesystem, especially when viewed through the lens of some higher-level tools (e.g., gsutil, Cloud Console), in the end, it's not a traditional filesystem, and if you expect it to behave exactly like the filesystems you are accustomed to, you may get some surprises. For example, Cloud Storage does not have directories, in the traditional sense.
The Cloud Storage documentation refers to an individual piece of data stored in Cloud Storage as an "object," not as a "BLOB," and throughout this book, we will use the term "object" as well.
Getting Started
Go to http://cloud.google.com/console and select the project you created in the "Creating a Compute Engine Project" section in Chapter 1. In the lefthand navigation bar, click Storage > Cloud Storage > Storage browser. Assuming you have not used this project to access Cloud Storage before, you should see a welcome message, as shown in Figure 4-1.
Figure 4-1. Cloud Storage welcome screen
As the UI suggests, your first act should be to create a new bucket. A bucket is a container for objects. Press the "Create a bucket" button, and enter a name for your new bucket. Normally, bucket names may contain only lowercase letters, numbers, dashes, and underscores. Also, bucket names must be globally unique across the entire Cloud Storage service, so if you choose something obvious, like "test," there's a good chance you'll get an error, because someone else already created a bucket with that name.
As mentioned earlier, Cloud Storage does not support the concept of directories in the traditional sense. While a bucket is a container for objects, similar to how a directory is a container for files, you cannot nest buckets inside buckets, the way you can nest subdirectories into parent directories in a hierarchy like most filesystems provide.
If you've created a bucket, then congratulations! Your project is set up to use Cloud Storage. You can use the Storage browser in the Cloud Console to create and delete buckets, upload objects, download objects, delete objects, and adjust object permissions and metadata. If you're only working with a handful of objects, this is probably the quickest and easiest way to do what you need. However, just as with Compute Engine, there are several ways to use Cloud Storage, including an API and a command-line tool, gsutil, which we examine next.
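As a quick preview of that tool, a single command creates a bucket from the command line; this is a minimal sketch, and the bucket name shown is a placeholder (yours must be globally unique):

$ gsutil mb gs://my-unique-bucket-name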
Introducing gsutil
In earlier chapters, you have been using the gcloud compute command to interact with Compute Engine. gsutil is the equivalent command for Cloud Storage. Let's create a Compute Engine instance called test-vm so we can take gsutil for a spin. Note the use of the --scopes flag, which was introduced in Chapter 2:
$ gcloud compute instances create test-vm \
    --zone us-central1-a --scopes storage-full
[..]
Now we can ssh into your new instance to take gsutil for a spin:
$ gcloud compute ssh test-vm --zone us-central1-a
[..]
The gsutil ls command gives you a list of your buckets, and we can see the bucket we created using the Cloud Console Web UI in the previous section. Note that because of the global bucket namespace, your bucket name will be different than the bucket name shown in this sample output:
test-vm$ gsutil ls
gs://gce-oreilly-example/
gsutil uses a URI syntax to allow the user to express, "This is a local file" (e.g., file://path/to/local/file), versus, "This is an object in Google Cloud Storage" (e.g., gs://bucket/object), versus, "This is an object in another cloud storage system" (e.g., s3://bucket/object). If you don't specify a scheme on the URI, gsutil assumes you mean a local file.
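For instance, these hedged sample commands (paths and bucket names are placeholders) use the URI syntax explicitly and implicitly:

$ gsutil cp file:///tmp/report.csv gs://my-bucket/report.csv  # explicit local scheme
$ gsutil cp /tmp/report.csv gs://my-bucket/report.csv         # same copy; no scheme means local
$ gsutil cp s3://my-s3-bucket/report.csv gs://my-bucket/      # from another cloud storage system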
Now let's create our first object. Be sure to use your bucket name with these commands, not the bucket name shown in this example:
test-vm$ echo 'Hello Cloud Storage!' > hello
test-vm$ gsutil cp hello gs://gce-oreilly-example
Copying file://hello [Content-Type=application/octet-stream]...
test-vm$ gsutil ls gs://gce-oreilly-example
gs://gce-oreilly-example/hello
test-vm$ gsutil cat gs://gce-oreilly-example/hello
Hello Cloud Storage!
You have now stored and retrieved an object in Cloud Storage. If you go back to the Cloud Console Web UI page you were using earlier and click your bucket name, you should now see the hello object there.
There's a fair bit going on here, so let's break it down. First of all, you'll notice that you did not need to install gsutil. The images provided by Compute Engine already have a version of gsutil installed and ready to go.
There are many occasions where you'll want to use gsutil outside of a Compute Engine instance. For example, maybe you have files on your development workstation that you want to upload to a Cloud Storage bucket, so you can then operate on that data from Compute Engine. Fortunately, if you followed the instructions in Chapter 1 to install the Cloud SDK, you already have a copy of gsutil installed on your workstation.
Next, you'll notice that you didn't need to provide any credentials to gsutil: no OAuth flow, no editing configuration files. Somehow, it obtained appropriate credentials to act on Cloud Storage. As we discussed in Chapter 2, this particular piece of magic is enabled via the --scopes flag that you passed to gcloud compute when you asked it to create the instance. What you did with that flag is tell Compute Engine that you want programs running on this instance to be able to use the service account that was automatically created when you created your project. The storage-full part of that flag tells it what services you want those programs to be able to use (in this case, Cloud Storage). Finally, gsutil understands that it is running in a Compute Engine instance configured this way and automatically acts as the project's service account, because that's obviously what you intended when you created the instance using the --scopes flag.
This is why you were able to ssh into your freshly created Compute Engine instance and immediately issue the gsutil ls command and see a list of the buckets owned by the project that owns the Compute Engine instance. If you signed into a different instance owned by a different project, you would see that project's buckets instead.
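Under the hood, programs on the instance obtain these credentials from the local metadata server. This hedged sketch shows the idea; the path is the standard Compute Engine v1 metadata endpoint, but treat the exact response shown as illustrative:

test-vm$ curl -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
{"access_token":"ya29.[..]","expires_in":3599,"token_type":"Bearer"}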
gsutil is a very powerful tool that exposes every significant feature provided by Cloud Storage. Because this book is about Compute Engine and not Cloud Storage, there's a lot we don't have space to cover. However, spending some quality time with gsutil's extensive built-in help is highly recommended. The gsutil help command is your starting point.
One of the many useful features of gsutil is that it can transparently work with local files (e.g., /tmp/my-local-file), objects in Cloud Storage (e.g., gs://my-bucket/my-object), or objects in Amazon's S3 service (e.g., s3://my-s3-bucket/my-s3-object). This means the following gsutil command is legal and does exactly what you might expect (copy all the objects in an S3 bucket to a Cloud Storage bucket):
$ gsutil cp s3://my-s3-bucket/* gs://my-gcs-bucket
If the bucket is large, you'll probably want to use the -m (multithreaded) command-line switch as well. -L (log) and -n (noclobber) are also very handy for this sort of operation, as is the gsutil rsync command. gsutil help can tell you more about those options and commands.
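As a hedged sketch (the bucket names are placeholders), a resilient bulk copy might look like this; -n skips objects that already exist at the destination, and -L records a manifest you can audit or use to resume:

$ gsutil -m cp -n -L transfer.log s3://my-s3-bucket/* gs://my-gcs-bucket
$ gsutil -m rsync -r s3://my-s3-bucket gs://my-gcs-bucket   # alternative: make the buckets match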
Using Cloud Storage from Your Code
As useful as the Cloud Console and gsutil can be, there may come a time when you need to perform some operations on your objects or buckets in the context of a larger program. Shelling out to gsutil is always an option, but may not always be the best option. Fortunately, Cloud Storage provides a full-featured API that your programs can use to interact directly with your objects and buckets. In fact, it provides two APIs: an XML-oriented one and a JSON-oriented one.
The two APIs provide almost all the same functionality, just with different styles. If you're starting from scratch, the JSON API is probably the one you want to use and is the one we will demonstrate here. It is consistent in style and structure with other Google APIs such as Google Maps, Google+, and Google Analytics. This makes it possible for Google to provide helpful client libraries in many different languages and useful tools such as the Google Plugin for Eclipse. The consistency between Google APIs makes it easier for a programmer who is familiar with one Google API to be immediately productive with a different Google API.
The XML API, not coincidentally, closely resembles Amazon's S3 REST API, making it easy for developers to add support for Cloud Storage to existing tools, libraries, and other code that was originally written for use with S3. If you have some existing code that works with S3 and you want to migrate to Cloud Storage, the XML API makes that easier.
Before writing any code, you need to install the Google APIs Client Library for Python. These Python libraries make it easier to work with many different Google APIs, not just Cloud Storage. pip is a great tool for installing Python packages and is available via apt-get on Debian-based Compute Engine instances. ssh into the test-vm instance you created earlier and run these commands:
test-vm$ sudo apt-get update
[..]
test-vm$ sudo apt-get install python-pip
[..]
test-vm$ sudo pip install --upgrade google-api-python-client
Downloading/unpacking google-api-python-client
[..]
Successfully installed google-api-python-client httplib2
Cleaning up...
The following command downloads a simple Python program that demonstrates how to access an object in Cloud Storage:
test-vm$ gsutil cp gs://gce-oreilly/hello_cloud_storage.py .
Copying gs://gce-oreilly/hello_cloud_storage.py...
Here is the content of hello_cloud_storage.py:
import httplib2

from apiclient.discovery import build
from oauth2client import gce

# These two lines take care of all the frequently tricky authorization
# steps by getting us an Http object that automatically adds the
# appropriate Authorization: header to our requests, using the
# service account associated with the project that owns the Compute
# Engine instance on which this program is running. Note that for
# this to work, the --scopes=storage-full flag must be specified to
# gcloud compute when the instance was created.
credentials = gce.AppAssertionCredentials(
    scope='https://www.googleapis.com/auth/devstorage.read_write')
http = credentials.authorize(httplib2.Http())

# The Google APIs library dynamically builds a Python object that
# understands the operations provided by Cloud Storage. Every API
# has a name and a version (in this case, 'storage' and 'v1').
# Depending on when you are reading this, you may find there is a
# newer version of the 'storage' API available.
storage = build('storage', 'v1')

# Google APIs expose collections, which typically expose methods that
# are common across many APIs, such as list(), or get(). In the case
# of Cloud Storage, the get() method on the objects collection gets
# an object's metadata, and the get_media() method gets an object's
# data.
request = storage.objects().get_media(bucket='gce-oreilly', object='hello')

# Also note that get_media(), and other methods, do not perform the
# action directly. They instead return an HttpRequest object that can
# be used to perform the action. This is important, because it gives
# us the opportunity to authorize our request by passing in the Http
# object we created earlier that knows about our service account.
print request.execute(http=http)

# The previous call to get_media() fetched the object's data. This
# call to get() will fetch the object's metadata. The Google API
# libraries conveniently take care of converting the response from
# the JSON used on the network to a Python dictionary, which we can
# iterate over to print the object's metadata to the console.
request = storage.objects().get(bucket='gce-oreilly', object='hello')
metadata = request.execute(http=http)
for key, value in metadata.iteritems():
    print key + "=" + str(value)
When you run this through the Python interpreter, you should see the contents of gs://gce-oreilly/hello (in URI parlance) and the metadata associated with the object:
test-vm$ python hello_cloud_storage.py
Hello Cloud Storage!

kind=storage#object
contentType=application/octet-stream
name=hello
etag=CLDgk+KZhrwCEAI=
generation=1389995772670000
md5Hash=ocbFPgjShy+EHAb+0DpjJg==
bucket=gce-oreilly
[..]
size=21
While dumping the contents and metadata of a single object to the console is clearly the simplest possible Cloud Storage programming task, it nevertheless demonstrates several key points. First, Compute Engine makes it easy to use Cloud Storage via the built-in service account support. Second, using the Cloud Storage JSON API means you do not need to laboriously assemble correctly formatted custom HTTP requests. The Google APIs client library understands the operations available and how to formulate the appropriate requests. Third, the Google APIs library handles translating JSON responses into convenient native Python dictionaries.
Configuring Access Control
Up to this point, we have seen how to create buckets and objects and read their contents using the Cloud Console Web UI, gsutil, and your own custom Python code. We have always been acting either as ourselves as an owner of our project, or as the automatically created project service account, which is also a member of the project. Unsurprisingly, owners and members of the project have, by default, the appropriate access rights to create buckets owned by that project, and create and read objects in those buckets. Where access control gets interesting, and where Cloud Storage gets particularly useful, is when you want to give specific rights to people or service accounts that are not part of your project. This is also where we will start to see some of the significant differences between Cloud Storage and traditional filesystems.
Every object in Cloud Storage has an access control list (ACL). You can use gsutil acl get to see the ACL applied to an object.
What's an ACL? An ACL is a list of people or groups that you're granting permission to perform specific operations on an object. ACLs are more explicit and flexible than the permission bits you may be accustomed to working with on UNIX-style filesystems, which only allow you to specify permissions for the file's "owner," "group," and "everybody else," because you are not limited to granting permissions only to a single individual (the owner) and a single group.
If you are not already logged into your test-vm, use the gcloud compute ssh command to do so and try the following example (using your bucket name instead of the one shown here, of course):
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  },
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  }
]
How to read this? Notice that "entities" are being assigned "roles." In this particular case, the first three entities are groups that correspond to the various team members of your project whom you've added with "Is Owner," "Can Edit," or "Can View" permissions. This is what Cloud Storage calls a project-private "canned" ACL. There are other so-called canned ACLs that are useful for common scenarios. The project-private canned ACL is a reasonable default for many situations, giving the project team members reasonable default rights, while making sure that no one outside the project can access the object. You can apply a different canned ACL via gsutil. For example, if you want to make the object completely private to yourself, the private canned ACL will do the trick:
test-vm$ gsutil acl set private gs://gce-oreilly-example/hello
Setting ACL on gs://gce-oreilly-example/hello...
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  }
]
You're now the only one in the world who can access this particular object. You have the right to modify the ACL because you are the OWNER of the object. Similarly, if you want to share your object with everyone, you can use the public-read canned ACL:
test-vm$ gsutil acl set public-read gs://gce-oreilly-example/hello
Setting ACL on gs://gce-oreilly-example/hello...
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  },
  {
    "entity": "allUsers",
    "role": "READER"
  }
]
You can now see that the entity allUsers has the READER role for this object. Objects that give allUsers the READER role do not require authentication to be accessed. This means that anyone in the world can navigate a web browser to http://storage.googleapis.com/gce-oreilly-example/hello, and will be able to fetch the object. If you want this to work the way most users would expect, you may want to set an appropriate Content-Type (a.k.a. MIME type) on your object, so the browser will know what to do with it. If you do not set the Content-Type, Cloud Storage uses the default of binary/octet-stream. Most browsers will interpret binary/octet-stream as a file to be downloaded and ask users where they want to save the file. If your object contains HTML data, this is probably not the behavior you want. gsutil helps you out by looking at the extension of the local filename and inferring what type it should be. For example, if you upload hello.txt, gsutil will automatically apply a Content-Type of text/plain. You can set the Content-Type (and a few other useful headers) on Cloud Storage objects via the gsutil setmeta command.
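For example, here is a hedged sketch of correcting the type on an HTML object already in a bucket (the object name is a placeholder):

test-vm$ gsutil setmeta -h "Content-Type:text/html" gs://gce-oreilly-example/page.html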
You are not allowed to set arbitrary headers on your objects. See gsutil help metadata for the current list of allowed headers.
Another thing you might want to do is share an object with a particular person, or group of people, who are not part of the project team. For example, you may wish to share an object with a set of your customers or end users. The most efficient way to do this is to first create a Google group, add the individuals to that group, and grant that group permission to the object.
You can create a new Google group by going to http://groups.google.com and clicking Create Group. Because you're using this group to manage access to resources, you'll want to make sure that people can't add themselves to it without your permission. The group settings for this example are shown in Figure 4-2. Note that "Only invited users" is selected for who can join the group.
Figure 4-2. Example Google Group settings
Now that we have a group, we can grant it read permission to the object. First we restore the private canned ACL, then we use the gsutil acl ch (change ACL) command to selectively add a read permission for the group:
test-vm$ gsutil acl set private gs://gce-oreilly-example/hello
Setting ACL on gs://gce-oreilly-example/hello...
test-vm$ gsutil acl ch -g gce-oreilly-example@googlegroups.com:r \
    gs://gce-oreilly-example/hello
Updated ACL on gs://gce-oreilly-example/hello
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  },
  {
    "email": "gce-oreilly-example@googlegroups.com",
    "entity": "group-gce-oreilly-example@googlegroups.com",
    "role": "READER"
  }
]
The group now has read access to the object. But to reiterate, while you can add individual users to an object's ACL, it's a best practice to only add groups, so that when people join and leave teams, you can simply add or remove them from the group, instead of having to update the ACL on potentially millions of objects.
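For completeness, here is a hedged sketch of both forms (the addresses are placeholders); the -u form is the one that becomes unmanageable at scale:

test-vm$ gsutil acl ch -u alice@example.com:r gs://gce-oreilly-example/hello   # one user
test-vm$ gsutil acl ch -g my-team@googlegroups.com:r gs://gce-oreilly-example/hello   # a group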
Understanding ACLs
So far we have only been looking at object ACLs. And you'll also notice that we've only seen two roles: OWNER and READER. That's because those are the only two roles that an object can have, and they are concentric (i.e., OWNER implies READER). The meaning of READER is pretty self-explanatory: it means you're allowed to fetch the content of the object. OWNER means that, in addition to being able to fetch the object, you also have the right to modify the ACL.
You're probably wondering where the WRITER role for objects is. This is one of the differences between a BLOB storage system and a traditional filesystem. If you think about what WRITER means in a filesystem, it means you can open the file and modify its contents. There is no such operation in Cloud Storage, as objects in Cloud Storage are immutable. Once an object is written to Cloud Storage, you can't append more data to it, or change just a few bytes in the middle of it. You can only delete the object or overwrite it with a completely new object. Thus, there is no WRITER role for objects to determine who has permission to perform a write operation, because it is impossible to perform a write operation on an object.
Instead, the WRITER role exists at the bucket level in Cloud Storage. Having the WRITER role on a bucket means that you are allowed to create, delete, or overwrite objects in that bucket.
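As a hedged sketch (the group address is a placeholder), granting a group the WRITER role on a bucket might look like this; in gsutil acl ch, :w denotes WRITER:

test-vm$ gsutil acl ch -g my-team@googlegroups.com:w gs://gce-oreilly-example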
The overwrite operation in Cloud Storage is atomic and strongly consistent. Putting this in terms of gsutil operations, if you run:
gsutil cp hello.html gs://mybucket/hello.html
to copy a new version of hello.html to your bucket, overwriting the version of hello.html that's already there, no clients will ever see a partially written hello.html. Before the gsutil command completes, they will see the old version; after the gsutil command completes, they will see the new version; and at no time will they see a "Not Found" error or partially written data.
It's also important to understand that having the READER role on a bucket does not give you the right to read the content of the objects in the bucket. That privilege is granted by having the READER role on the object itself, as already discussed. Instead, having the READER role on the bucket gives you the right to get a list of the objects contained in that bucket.
Finally, the OWNER role on a bucket gives you the right to modify the ACL on the bucket and also to modify something called the "default object ACL," which we discuss in the next section.
In this book, we use the ACL terminology from the Cloud Storage JSON API, which assigns "roles" to "entities." If you're using the Cloud Storage XML API, you'll be assigning "permissions" to "scopes," but the functionality is the same.
Using Default Object ACLs
The default object ACL is a very useful feature of Cloud Storage, and is worth understanding well. In the case that the project-private canned ACL is the ACL you want for every object you add to a bucket, the default behavior demonstrated earlier is exactly what you want. But what if you instead want all the objects in the bucket to have an ACL that gives a specific group READ permission?
Associating a custom default object ACL with the bucket solves this problem in a clean and convenient way. What you're telling Cloud Storage with a default object ACL is "Please apply this ACL to every object that is written to this bucket, unless the ACL is explicitly overridden." In fact, every bucket has a default object ACL, and the "default" default object ACL is, of course, project-private, which is why the first object we created had the project-private ACL. Now let's change the default object ACL on our test bucket to a custom default object ACL that provides READ access to a specific group:
test-vm$ gsutil defacl get gs://gce-oreilly-example
[
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  }
]
test-vm$ gsutil defacl ch -g gce-oreilly-example@googlegroups.com:r \
    gs://gce-oreilly-example
Updated default ACL on gs://gce-oreilly-example/
test-vm$ gsutil defacl get gs://gce-oreilly-example
[
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  },
  {
    "email": "gce-oreilly-example@googlegroups.com",
    "entity": "group-gce-oreilly-example@googlegroups.com",
    "role": "READER"
  }
]
We can use gsutil's defacl get, defacl set, and defacl ch commands to view and modify the bucket's default object ACL exactly like we've been using acl get, acl set, and acl ch to view and change object and bucket ACLs. This sequence of commands also demonstrates that the "default" default object ACL is indeed project-private, and then shows how it can be modified. It is interesting to notice that the updated default object ACL does not mention the object's owner like we saw when we ran a similar sequence of commands on the hello object. This is because we don't know, in advance, who will create a particular object, and therefore don't know who the owner of that object will be. The owner of an object always has the OWNER role on an object, and the default object ACL specifies which permissions should be added in addition to the OWNER role.
The default object ACL on the bucket is only applied to newly created objects. To see this in action, first we ensure that the existing hello object is project-private:
test-vm$ gsutil acl set project-private gs://gce-oreilly-example/hello
Setting ACL on gs://gce-oreilly-example/hello...
Now we can create a second object and see that the default object ACL is applied to it:
test-vm$ gsutil cp hello gs://gce-oreilly-example/hello2
Copying file://hello [Content-Type=application/octet-stream]...
test-vm$ gsutil acl get gs://gce-oreilly-example/hello2
[
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  },
  {
    "email": "gce-oreilly-example@googlegroups.com",
    "entity": "group-gce-oreilly-example@googlegroups.com",
    "role": "READER"
  },
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  }
]
Note how the default object ACL has been expanded to include an owner, and that owner has been given full control of the new object.
We can confirm that changing the default object ACL on the bucket does not change any existing object ACLs by examining the old hello object:
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  },
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  }
]
It is very important to carefully consider the structure of your ACLs during development of your application. If you do not, you may find yourself in a situation where you need to update millions of object ACLs, individually, which can be inconvenient at best.
Understanding Object Immutability
As mentioned a bit earlier, all objects in Cloud Storage are immutable. This means they cannot be changed. You can overwrite an existing object with a new one, but unlike what you may be accustomed to in a traditional filesystem, you cannot "open" an object, "seek" to an arbitrary offset in the object, "write" a series of bytes, and "close" the file. If you want to overwrite an object, you have to upload a new object, from the first byte to the last byte.
Cloud Storage does allow you to compose existing objects into new objects, which can be used to simulate a very limited form of "writing" to the middle or end of an existing object. Search the Cloud Storage documentation for "composite objects" for more details.
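For instance, this hedged sketch simulates appending to the hello object with gsutil compose, which concatenates source objects, in order, into a destination object in the same bucket (the extra object is a placeholder created just for the example):

test-vm$ echo 'more data' > extra
test-vm$ gsutil cp extra gs://gce-oreilly-example/extra
test-vm$ gsutil compose gs://gce-oreilly-example/hello gs://gce-oreilly-example/extra \
    gs://gce-oreilly-example/hello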
A corollary to the fact that all Cloud Storage objects are immutable is that you cannot read a partially written object. An object doesn't exist in Cloud Storage until you have uploaded the last byte and received a 200 OK response from the service.
Understanding Strong Consistency
Cloud Storage provides a strong "read after write" consistency guarantee when you finish writing an object. This means that if you write an object and get a 200 OK response, you can be sure that anyone, anywhere in the world who is authorized to read that object will be able to do so and will see the data you just finished writing (not some previous version you may have just overwritten). This stands in contrast to some cloud storage systems where one user could write an object, receive a 200 OK response, and then another user who attempts to read that object, perhaps from a different location, could receive a 404 "Not Found" response, or worse, read a previous, out-of-date version of that object.
However, there are two caveats to the strong consistency of Cloud Storage. First, it does not apply to listing the contents of a bucket, or listing the buckets that belong to a project. Putting this in gsutil terms, if you gsutil cp an object into a bucket, it may take a few seconds or longer for that object to appear in the results of a gsutil ls <bucket> command, but if you know the name of the object, you are guaranteed to be able to gsutil stat that object. Similarly, when you create a bucket, it may take some time before it shows up in the results of gsutil ls.
The second caveat is that the statements on strong consistency do not apply if you allow your object to be cached using normal HTTP caching mechanisms. If you've allowed your object to be cached, which also implies it is publicly readable, then you may receive an out-of-date cached copy of an object instead of the one you just wrote, or you may still receive a copy of the object from a cache after it has been deleted from Cloud Storage. This is expected HTTP behavior. For more details, look up the Cache-Control header, both in the Cloud Storage documentation and in the HTTP specifications.
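As a hedged illustration, the Cache-Control header can be set per object with gsutil setmeta (the object names and max-age value are placeholders):

test-vm$ gsutil setmeta -h "Cache-Control:private, max-age=0" gs://gce-oreilly-example/hello
test-vm$ gsutil setmeta -h "Cache-Control:public, max-age=3600" gs://gce-oreilly-example/page.html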
Summary
In this chapter, we learned that Cloud Storage is a BLOB storage system, and what exactly that means. Then we saw how to use Cloud Storage to create buckets and objects via the Cloud Console UI, the gsutil command-line tool, and your own Python code using the Cloud Storage JSON API. We discussed the Cloud Storage ACL model in detail, and touched on the concepts of object immutability and strong consistency.
Up Next
Now that we've learned about the relatively unstructured storage mechanisms available to Compute Engine, namely Persistent Disk and Cloud Storage, we'll explore the more structured storage mechanisms of Cloud SQL and Cloud Datastore.