Chapter 4. Storage: Cloud Storage
Google Cloud Storage provides the ability to store binary large objects (a.k.a. BLOBs) in the cloud. You can read or write BLOBs from anywhere on the Internet, subject to access control that you can define, including from any Compute Engine instance, no matter which zone it is in. Your BLOBs are also stored in a durable, available, and secure way, and are served by a massively scalable distributed infrastructure.
This makes Cloud Storage ideal for two key scenarios:
-
Storing and retrieving unstructured data from any Compute Engine instance in any zone
-
Storing and retrieving unstructured data from both inside and outside Compute Engine (e.g., for sharing data with your customers)
A particularly important special case of the second scenario is importing or exporting large quantities of data into and out of Google's Cloud Platform.
It is important to understand the difference between durability and availability when it comes to storage. Roughly speaking, durability is defined by the probability that an object will not be permanently and irrevocably lost, whereas availability is defined by the probability that an object can be retrieved at a particular point in time. Cloud Storage provides a service-level agreement (SLA) that clearly specifies what level of availability the customer should expect and what sort of compensation the customer will receive if that SLA is not met. Cloud Storage offers multiple classes of storage, at different prices, with different SLAs. However, in all the current offerings, the data is stored in a highly durable manner, which means that even if you can't access your data right now, it is not lost forever. It will become available again at some point in the (not too distant) future. Google does not provide an estimate of durability. In practice, with multiple copies of your data stored in multiple data centers, the probability of losing every copy of your data is extremely low, making the theoretical durability incredibly high.
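To get a feel for why replication drives theoretical durability so high, here is a back-of-the-envelope sketch in Python. The per-copy loss probability is an invented illustration, not a figure Google publishes, and it assumes copies fail independently:

# Hypothetical illustration of replication vs. durability.
# The per-copy loss probability is invented, not a published figure.
p_loss_one_copy = 1e-3  # assumed chance of losing a single copy in a year
for copies in (1, 2, 3):
    # With independent failures, every copy must be lost to lose the object.
    p_loss_all = p_loss_one_copy ** copies
    print("copies=%d  P(object lost)=%g" % (copies, p_loss_all))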
Understanding BLOB Storage
BLOB has become a common industry term for a file of any type. While formats such as ASCII or Unicode are not generally considered binary files, they are made up of ones and zeros just like JPEG images, MPEG movies, Linux executable files, or any other type of file. Cloud Storage is considered a BLOB storage system because it treats all files as unstructured binary data.
Similarly, there's no particular reason why a BLOB needs to be particularly large. Cloud Storage is perfectly happy to store an object that has zero bytes of content. BLOBs in Cloud Storage can be up to 5 TB in size.
At this point, you might be saying to yourself, "A BLOB sounds a whole lot like a file to me." And you would be exactly right. However, one reason the industry has taken to referring to this style of storage as a "BLOB store" instead of "filesystem," and thus calling the contents "BLOBs" or "objects" instead of "files," is because the word "filesystem" implies a great deal of functionality that so-called "BLOB stores" typically do not provide. Not providing certain filesystem features offers some useful scalability tradeoffs. After we've taken a tour of Cloud Storage, we'll return to this topic and examine it in more detail; but for now, just keep in mind that while Cloud Storage may look and feel a whole lot like a filesystem, especially when viewed through the lens of some higher-level tools (e.g., gsutil, Cloud Console), in the end, it's not a traditional filesystem, and if you expect it to behave exactly like the filesystems you are accustomed to, you may get some surprises. For example, Cloud Storage does not have directories, in the traditional sense.
The Cloud Storage documentation refers to an individual piece of data stored in Cloud Storage as an "object," not as a "BLOB," and throughout this book, we will use the term "object" as well.
Getting Started
Go to http://cloud.google.com/console and select the project you created in the "Creating a Compute Engine Project" section in Chapter 1. In the lefthand navigation bar, click Storage > Cloud Storage > Storage browser. Assuming you have not used this project to access Cloud Storage before, you should see a welcome message, as shown in Figure 4-1.
Figure 4-1. Cloud Storage welcome screen
As the UI suggests, your first act should be to create a new bucket. A bucket is a container for objects. Press the "Create a bucket" button, and enter a name for your new bucket. Normally, bucket names may contain only lowercase letters, numbers, dashes, and underscores. Also, bucket names must be globally unique across the entire Cloud Storage service, so if you choose something obvious, like "test," there's a good chance you'll get an error, because someone else already created a bucket with that name.
As mentioned earlier, Cloud Storage does not support the concept of directories in the traditional sense. While a bucket is a container for objects, similar to how a directory is a container for files, you cannot nest buckets inside buckets, the way you can nest subdirectories into parent directories in a hierarchy like most filesystems provide.
If you've created a bucket, then congratulations! Your project is set up to use Cloud Storage. You can use the Storage browser in the Cloud Console to create and delete buckets, upload objects, download objects, delete objects, and adjust object permissions and metadata. If you're only working with a handful of objects, this is probably the quickest and easiest way to do what you need. However, just as with Compute Engine, there are several ways to use Cloud Storage, including an API and a command-line tool, gsutil, which we examine next.
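As a quick preview of that tool, a single command creates a bucket from the command line; this is a minimal sketch, and the bucket name shown is a placeholder (yours must be globally unique):

$ gsutil mb gs://my-unique-bucket-name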
Introducing gsutil
In earlier chapters, you have been using the gcloud compute command to interact with Compute Engine. gsutil is the equivalent command for Cloud Storage. Let's create a Compute Engine instance called test-vm so we can take gsutil for a spin. Note the use of the --scopes flag, which was introduced in Chapter 2:
$ gcloud compute instances create test-vm \
    --zone us-central1-a --scopes storage-full
[..]
Now we can ssh into your new instance to take gsutil for a spin:
$ gcloud compute ssh test-vm --zone us-central1-a
[..]
The gsutil ls command gives you a list of your buckets, and we can see the bucket we created using the Cloud Console Web UI in the previous section. Note that because of the global bucket namespace, your bucket name will be different than the bucket name shown in this sample output:
test-vm$ gsutil ls
gs://gce-oreilly-example/
gsutil uses a URI syntax to allow the user to express, "This is a local file" (e.g., file://path/to/local/file), versus, "This is an object in Google Cloud Storage" (e.g., gs://bucket/object), versus, "This is an object in another cloud storage system" (e.g., s3://bucket/object). If you don't specify a scheme on the URI, gsutil assumes you mean a local file.
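For instance, these hedged sample commands (paths and bucket names are placeholders) use the URI syntax explicitly and implicitly:

$ gsutil cp file:///tmp/report.csv gs://my-bucket/report.csv  # explicit local scheme
$ gsutil cp /tmp/report.csv gs://my-bucket/report.csv         # same copy; no scheme means local
$ gsutil cp s3://my-s3-bucket/report.csv gs://my-bucket/      # from another cloud storage system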
Now let's create our first object. Be sure to use your bucket name with these commands, not the bucket name shown in this example:
test-vm$ echo 'Hello Cloud Storage!' > hello
test-vm$ gsutil cp hello gs://gce-oreilly-example
Copying file://hello [Content-Type=application/octet-stream]...
test-vm$ gsutil ls gs://gce-oreilly-example
gs://gce-oreilly-example/hello
test-vm$ gsutil cat gs://gce-oreilly-example/hello
Hello Cloud Storage!
You have now stored and retrieved an object in Cloud Storage. If you go back to the Cloud Console Web UI page you were using earlier and click your bucket name, you should now see the hello object there.
There's a fair bit going on here, so let's break it down. First of all, you'll notice that you did not need to install gsutil. The images provided by Compute Engine already have a version of gsutil installed and ready to go.
There are many occasions where you'll want to use gsutil outside of a Compute Engine instance. For example, maybe you have files on your development workstation that you want to upload to a Cloud Storage bucket, so you can then operate on that data from Compute Engine. Fortunately, if you followed the instructions in Chapter 1 to install the Cloud SDK, you already have a copy of gsutil installed on your workstation.
Next, you'll notice that you didn't need to provide any credentials to gsutil: no OAuth flow, no editing configuration files. Somehow, it obtained appropriate credentials to act on Cloud Storage. As we discussed in Chapter 2, this particular piece of magic is enabled via the --scopes flag that you passed to gcloud compute when you asked it to create the instance. What you did with that flag is tell Compute Engine that you want programs running on this instance to be able to use the service account that was automatically created when you created your project. The storage-full part of that flag tells it what services you want those programs to be able to use (in this case, Cloud Storage). Finally, gsutil understands that it is running in a Compute Engine instance configured this way and automatically acts as the project's service account, because that's obviously what you intended when you created the instance using the --scopes flag.
This is why you were able to ssh into your freshly created Compute Engine instance and immediately issue the gsutil ls command and see a list of the buckets owned by the project that owns the Compute Engine instance. If you signed into a different instance owned by a different project, you would see that project's buckets instead.
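Under the hood, programs on the instance obtain these credentials from the local metadata server. This hedged sketch shows the idea; the path is the standard Compute Engine v1 metadata endpoint, but treat the exact response shown as illustrative:

test-vm$ curl -H "Metadata-Flavor: Google" \
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
{"access_token":"ya29.[..]","expires_in":3599,"token_type":"Bearer"}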
gsutil is a very powerful tool that exposes every significant feature provided by Cloud Storage. Because this book is about Compute Engine and not Cloud Storage, there's a lot we don't have space to cover. However, spending some quality time with gsutil's extensive built-in help is highly recommended. The gsutil help command is your starting point.
One of the many useful features of gsutil is that it can transparently work with local files (e.g., /tmp/my-local-file), objects in Cloud Storage (e.g., gs://my-bucket/my-object), or objects in Amazon's S3 service (e.g., s3://my-s3-bucket/my-s3-object). This means the following gsutil command is legal and does exactly what you might expect (copy all the objects in an S3 bucket to a Cloud Storage bucket):
$ gsutil cp s3://my-s3-bucket/* gs://my-gcs-bucket
If the bucket is large, you'll probably want to use the -m (multithreaded) command-line switch as well. -L (log) and -n (noclobber) are also very handy for this sort of operation, as is the gsutil rsync command. gsutil help can tell you more about those options and commands.
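As a hedged sketch (the bucket names are placeholders), a resilient bulk copy might look like this; -n skips objects that already exist at the destination, and -L records a manifest you can audit or use to resume:

$ gsutil -m cp -n -L transfer.log s3://my-s3-bucket/* gs://my-gcs-bucket
$ gsutil -m rsync -r s3://my-s3-bucket gs://my-gcs-bucket   # alternative: make the buckets match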
Using Cloud Storage from Your Code
As useful as the Cloud Console and gsutil can be, there may come a time when you need to perform some operations on your objects or buckets in the context of a larger program. Shelling out to gsutil is always an option, but may not always be the best option. Fortunately, Cloud Storage provides a full-featured API that your programs can use to interact directly with your objects and buckets. In fact, it provides two APIs: an XML-oriented one and a JSON-oriented one.
The two APIs provide almost all the same functionality, just with different styles. If you're starting from scratch, the JSON API is probably the one you want to use and is the one we will demonstrate here. It is consistent in style and structure with other Google APIs such as Google Maps, Google+, and Google Analytics. This makes it possible for Google to provide helpful client libraries in many different languages and useful tools such as the Google Plugin for Eclipse. The consistency between Google APIs makes it easier for a programmer who is familiar with one Google API to be immediately productive with a different Google API.
The XML API, not coincidentally, closely resembles Amazon's S3 REST API, making it easy for developers to add support for Cloud Storage to existing tools, libraries, and other code that was originally written for use with S3. If you have some existing code that works with S3 and you want to migrate to Cloud Storage, the XML API makes that easier.
Before writing any code, you need to install the Google APIs Client Library for Python. These Python libraries make it easier to work with many different Google APIs, not just Cloud Storage. pip is a great tool for installing Python packages and is available via apt-get on Debian-based Compute Engine instances. ssh into the test-vm instance you created earlier and run these commands:
test-vm$ sudo apt-get update
[..]
test-vm$ sudo apt-get install python-pip
[..]
test-vm$ sudo pip install --upgrade google-api-python-client
Downloading/unpacking google-api-python-client
[..]
Successfully installed google-api-python-client httplib2
Cleaning up...
The following command downloads a simple Python program that demonstrates how to access an object in Cloud Storage:
test-vm$ gsutil cp gs://gce-oreilly/hello_cloud_storage.py .
Copying gs://gce-oreilly/hello_cloud_storage.py...
Here is the content of hello_cloud_storage.py:
import httplib2

from apiclient.discovery import build
from oauth2client import gce

# These two lines take care of all the frequently tricky authorization
# steps by getting us an Http object that automatically adds the
# appropriate Authorization: header to our requests, using the
# service account associated with the project that owns the Compute
# Engine instance on which this program is running. Note that for
# this to work, the --scopes=storage-full flag must be specified to
# gcloud compute when the instance was created.
credentials = gce.AppAssertionCredentials(
    scope='https://www.googleapis.com/auth/devstorage.read_write')
http = credentials.authorize(httplib2.Http())

# The Google APIs library dynamically builds a Python object that
# understands the operations provided by Cloud Storage. Every API
# has a name and a version (in this case, 'storage' and 'v1').
# Depending on when you are reading this, you may find there is a
# newer version of the 'storage' API available.
storage = build('storage', 'v1')

# Google APIs expose collections, which typically expose methods that
# are common across many APIs, such as list(), or get(). In the case
# of Cloud Storage, the get() method on the objects collection gets
# an object's metadata, and the get_media() method gets an object's
# data.
request = storage.objects().get_media(bucket='gce-oreilly', object='hello')

# Also note that get_media(), and other methods, do not perform the
# action directly. They instead return an HttpRequest object that can
# be used to perform the action. This is important, because it gives
# us the opportunity to authorize our request by passing in the Http
# object we created earlier that knows about our service account.
print request.execute(http=http)

# The previous call to get_media() fetched the object's data. This
# call to get() will fetch the object's metadata. The Google API
# libraries conveniently take care of converting the response from
# the JSON used on the network to a Python dictionary, which we can
# iterate over to print the object's metadata to the console.
request = storage.objects().get(bucket='gce-oreilly', object='hello')
metadata = request.execute(http=http)
for key, value in metadata.iteritems():
    print key + "=" + str(value)
When you run this through the Python interpreter, you should see the contents of gs://gce-oreilly/hello (in URI parlance) and the metadata associated with the object:
test-vm$ python hello_cloud_storage.py
Hello Cloud Storage!

kind=storage#object
contentType=application/octet-stream
name=hello
etag=CLDgk+KZhrwCEAI=
generation=1389995772670000
md5Hash=ocbFPgjShy+EHAb+0DpjJg==
bucket=gce-oreilly
[..]
size=21
While dumping the contents and metadata of a single object to the console is clearly the simplest possible Cloud Storage programming task, it nevertheless demonstrates several key points. First, Compute Engine makes it easy to use Cloud Storage via the built-in service account support. Second, using the Cloud Storage JSON API means you do not need to laboriously assemble correctly formatted custom HTTP requests. The Google APIs client library understands the operations available and how to formulate the appropriate requests. Third, the Google APIs library handles translating JSON responses into convenient native Python dictionaries.
Configuring Access Control
Up to this point, we have seen how to create buckets and objects and read their contents using the Cloud Console Web UI, gsutil, and your own custom Python code. We have always been acting either as ourselves as an owner of our project, or as the automatically created project service account, which is also a member of the project. Unsurprisingly, owners and members of the project have, by default, the appropriate access rights to create buckets owned by that project, and create and read objects in those buckets. Where access control gets interesting, and where Cloud Storage gets particularly useful, is when you want to give specific rights to people or service accounts that are not part of your project. This is also where we will start to see some of the significant differences between Cloud Storage and traditional filesystems.
Every object in Cloud Storage has an access control list (ACL). You can use gsutil acl get to see the ACL applied to an object.
What's an ACL? An ACL is a list of people or groups that you're granting permission to perform specific operations on an object. ACLs are more explicit and flexible than the permission bits you may be accustomed to working with on UNIX-style filesystems, which only allow you to specify permissions for the file's "owner," "group," and "everybody else," because you are not limited to granting permissions only to a single individual (the owner) and a single group.
If you are not already logged into your test-vm, use the gcloud compute ssh command to do so and try the following example (using your bucket name instead of the one shown here, of course):
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  },
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  }
]
How to read this? Notice that "entities" are being assigned "roles." In this particular case, the first three entities are groups that correspond to the various team members of your project whom you've added with "Is Owner," "Can Edit," or "Can View" permissions. This is what Cloud Storage calls a project-private "canned" ACL. There are other so-called canned ACLs that are useful for common scenarios. The project-private canned ACL is a reasonable default for many situations, giving the project team members reasonable default rights, while making sure that no one outside the project can access the object. You can apply a different canned ACL via gsutil. For example, if you want to make the object completely private to yourself, the private canned ACL will do the trick:
test-vm$ gsutil acl set private gs://gce-oreilly-example/hello
Setting ACL on gs://gce-oreilly-example/hello...
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  }
]
You're now the only one in the world who can access this particular object. You have the right to modify the ACL because you are the OWNER of the object. Similarly, if you want to share your object with everyone, you can use the public-read canned ACL:
test-vm$ gsutil acl set public-read gs://gce-oreilly-example/hello
Setting ACL on gs://gce-oreilly-example/hello...
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  },
  {
    "entity": "allUsers",
    "role": "READER"
  }
]
You can now see that the entity allUsers has the READER role for this object. Objects that give allUsers the READER role do not require authentication to be accessed. This means that anyone in the world can navigate a web browser to http://storage.googleapis.com/gce-oreilly-example/hello, and will be able to fetch the object. If you want this to work the way most users would expect, you may want to set an appropriate Content-Type (a.k.a. MIME type) on your object, so the browser will know what to do with it. If you do not set the Content-Type, Cloud Storage uses the default of binary/octet-stream. Most browsers will interpret binary/octet-stream as a file to be downloaded and ask users where they want to save the file. If your object contains HTML data, this is probably not the behavior you want. gsutil helps you out by looking at the extension of the local filename and inferring what type it should be. For example, if you upload hello.txt, gsutil will automatically apply a Content-Type of text/plain. You can set the Content-Type (and a few other useful headers) on Cloud Storage objects via the gsutil setmeta command.
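For example, here is a hedged sketch of correcting the type on an HTML object already in a bucket (the object name is a placeholder):

test-vm$ gsutil setmeta -h "Content-Type:text/html" gs://gce-oreilly-example/page.html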
You are not allowed to set arbitrary headers on your objects. See gsutil help metadata for the current list of allowed headers.
Another thing you might want to do is share an object with a particular person, or group of people, who are not part of the project team. For example, you may wish to share an object with a set of your customers or end users. The most efficient way to do this is to first create a Google group, add the individuals to that group, and grant that group permission to the object.
You can create a new Google group by going to http://groups.google.com and clicking Create Group. Because you're using this group to manage access to resources, you'll want to make sure that people can't add themselves to it without your permission. The group settings for this example are shown in Figure 4-2. Note that "Only invited users" is selected for who can join the group.
Figure 4-2. Example Google Group settings
Now that we have a group, we can grant it read permission to the object. First we restore the private canned ACL, then we use the gsutil acl ch (change ACL) command to selectively add a read permission for the group:
test-vm$ gsutil acl set private gs://gce-oreilly-example/hello
Setting ACL on gs://gce-oreilly-example/hello...
test-vm$ gsutil acl ch -g gce-oreilly-example@googlegroups.com:r \
    gs://gce-oreilly-example/hello
Updated ACL on gs://gce-oreilly-example/hello
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  },
  {
    "email": "gce-oreilly-example@googlegroups.com",
    "entity": "group-gce-oreilly-example@googlegroups.com",
    "role": "READER"
  }
]
The group now has read access to the object. But to reiterate, while you can add individual users to an object's ACL, it's a best practice to only add groups, so that when people join and leave teams, you can simply add or remove them from the group, instead of having to update the ACL on potentially millions of objects.
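For completeness, here is a hedged sketch of both forms (the addresses are placeholders); the -u form is the one that becomes unmanageable at scale:

test-vm$ gsutil acl ch -u alice@example.com:r gs://gce-oreilly-example/hello   # one user
test-vm$ gsutil acl ch -g my-team@googlegroups.com:r gs://gce-oreilly-example/hello   # a group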
Understanding ACLs
So far we have only been looking at object ACLs. And you'll also notice that we've only seen two roles: OWNER and READER. That's because those are the only two roles that an object can have, and they are concentric (i.e., OWNER implies READER). The meaning of READER is pretty self-explanatory: it means you're allowed to fetch the content of the object. OWNER means that, in addition to being able to fetch the object, you also have the right to modify the ACL.
You're probably wondering where the WRITER role for objects is. This is one of the differences between a BLOB storage system and a traditional filesystem. If you think about what WRITER means in a filesystem, it means you can open the file and modify its contents. There is no such operation in Cloud Storage, as objects in Cloud Storage are immutable. Once an object is written to Cloud Storage, you can't append more data to it, or change just a few bytes in the middle of it. You can only delete the object or overwrite it with a completely new object. Thus, there is no WRITER role for objects to determine who has permission to perform a write operation, because it is impossible to perform a write operation on an object.
Instead, the WRITER role exists at the bucket level in Cloud Storage. Having the WRITER role on a bucket means that you are allowed to create, delete, or overwrite objects in that bucket.
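As a hedged sketch (the group address is a placeholder), granting a group the WRITER role on a bucket might look like this; in gsutil acl ch, :w denotes WRITER:

test-vm$ gsutil acl ch -g my-team@googlegroups.com:w gs://gce-oreilly-example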
The overwrite operation in Cloud Storage is atomic and strongly consistent. Putting this in terms of gsutil operations, if you run:
gsutil cp hello.html gs://mybucket/hello.html
to copy a new version of hello.html to your bucket, overwriting the version of hello.html that's already there, no clients will ever see a partially written hello.html. Before the gsutil command completes, they will see the old version; after the gsutil command completes, they will see the new version; and at no time will they see a "Not Found" error or partially written data.
It's also important to understand that having the READER role on a bucket does not give you the right to read the content of the objects in the bucket. That privilege is granted by having the READER role on the object itself, as already discussed. Instead, having the READER role on the bucket gives you the right to get a list of the objects contained in that bucket.
Finally, the OWNER role on a bucket gives you the right to modify the ACL on the bucket and also to modify something called the "default object ACL," which we discuss in the next section.
In this book, we use the ACL terminology from the Cloud Storage JSON API, which assigns "roles" to "entities." If you're using the Cloud Storage XML API, you'll be assigning "permissions" to "scopes," but the functionality is the same.
Using Default Object ACLs
The default object ACL is a very useful feature of Cloud Storage, and is worth understanding well. In the case that the project-private canned ACL is the ACL you want for every object you add to a bucket, the default behavior demonstrated earlier is exactly what you want. But what if you instead want all the objects in the bucket to have an ACL that gives a specific group READ permission?
Associating a custom default object ACL with the bucket solves this problem in a clean and convenient way. What you're telling Cloud Storage with a default object ACL is "Please apply this ACL to every object that is written to this bucket, unless the ACL is explicitly overridden." In fact, every bucket has a default object ACL, and the "default" default object ACL is, of course, project-private, which is why the first object we created had the project-private ACL. Now let's change the default object ACL on our test bucket to a custom default object ACL that provides READ access to a specific group:
test-vm$ gsutil defacl get gs://gce-oreilly-example
[
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  }
]
test-vm$ gsutil defacl ch -g gce-oreilly-example@googlegroups.com:r \
    gs://gce-oreilly-example
Updated default ACL on gs://gce-oreilly-example/
test-vm$ gsutil defacl get gs://gce-oreilly-example
[
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  },
  {
    "email": "gce-oreilly-example@googlegroups.com",
    "entity": "group-gce-oreilly-example@googlegroups.com",
    "role": "READER"
  }
]
We can use gsutil's defacl get, defacl set, and defacl ch commands to view and modify the bucket's default object ACL exactly like we've been using acl get, acl set, and acl ch to view and change object and bucket ACLs. This sequence of commands also demonstrates that the "default" default object ACL is indeed project-private, and then shows how it can be modified. It is interesting to notice that the updated default object ACL does not mention the object's owner like we saw when we ran a similar sequence of commands on the hello object. This is because we don't know, in advance, who will create a particular object, and therefore don't know who the owner of that object will be. The owner of an object always has the OWNER role on an object, and the default object ACL specifies which permissions should be added in addition to the OWNER role.
The default object ACL on the bucket is only applied to newly created objects. To see this in action, first we ensure that the existing hello object is project-private:
test-vm$ gsutil acl set project-private gs://gce-oreilly-example/hello
Setting ACL on gs://gce-oreilly-example/hello...
Now we can create a second object and see that the default object ACL is applied to it:
test-vm$ gsutil cp hello gs://gce-oreilly-example/hello2
Copying file://hello [Content-Type=application/octet-stream]...
test-vm$ gsutil acl get gs://gce-oreilly-example/hello2
[
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  },
  {
    "email": "gce-oreilly-example@googlegroups.com",
    "entity": "group-gce-oreilly-example@googlegroups.com",
    "role": "READER"
  },
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  }
]
Note how the default object ACL has been expanded to include an owner, and that owner has been given full control of the new object.
We can confirm that changing the default object ACL on the bucket does not change any existing object ACLs by examining the old hello object:
test-vm$ gsutil acl get gs://gce-oreilly-example/hello
[
  {
    "entity": "user-00b4[..]145f",
    "entityId": "00b4[..]145f",
    "role": "OWNER"
  },
  {
    "entity": "project-owners-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "owners"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-editors-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "editors"
    },
    "role": "OWNER"
  },
  {
    "entity": "project-viewers-1342[..]",
    "projectTeam": {
      "projectNumber": "1342[..]",
      "team": "viewers"
    },
    "role": "READER"
  }
]
It is very important to carefully consider the structure of your ACLs during development of your application. If you do not, you may find yourself in a situation where you need to update millions of object ACLs, individually, which can be inconvenient at best.
Understanding Object Immutability
As mentioned a bit earlier, all objects in Cloud Storage are immutable. This means they cannot be changed. You can overwrite an existing object with a new one, but unlike what you may be accustomed to in a traditional filesystem, you cannot "open" an object, "seek" to an arbitrary offset in the object, "write" a series of bytes, and "close" the file. If you want to overwrite an object, you have to upload a new object, from the first byte to the last byte.
Cloud Storage does allow you to compose existing objects into new objects, which can be used to simulate a very limited form of "writing" to the middle or end of an existing object. Search the Cloud Storage documentation for "composite objects" for more details.
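For instance, this hedged sketch simulates appending to the hello object with gsutil compose, which concatenates source objects, in order, into a destination object in the same bucket (the extra object is a placeholder created just for the example):

test-vm$ echo 'more data' > extra
test-vm$ gsutil cp extra gs://gce-oreilly-example/extra
test-vm$ gsutil compose gs://gce-oreilly-example/hello gs://gce-oreilly-example/extra \
    gs://gce-oreilly-example/hello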
A corollary to the fact that all Cloud Storage objects are immutable is that you cannot read a partially written object. An object doesn't exist in Cloud Storage until you have uploaded the last byte and received a 200 OK response from the service.
Understanding Strong Consistency
Cloud Storage provides a strong "read after write" consistency guarantee when you finish writing an object. This means that if you write an object and get a 200 OK response, you can be sure that anyone, anywhere in the world who is authorized to read that object will be able to do so and will see the data you just finished writing (not some previous version you may have just overwritten). This stands in contrast to some cloud storage systems where one user could write an object, receive a 200 OK response, and then another user who attempts to read that object, perhaps from a different location, could receive a 404 "Not Found" response, or worse, read a previous, out-of-date version of that object.
However, there are two caveats to the strong consistency of Cloud Storage. First, it does not apply to listing the contents of a bucket, or listing the buckets that belong to a project. Putting this in gsutil terms, if you gsutil cp an object into a bucket, it may take a few seconds or longer for that object to appear in the results of a gsutil ls <bucket> command, but if you know the name of the object, you are guaranteed to be able to gsutil stat that object. Similarly, when you create a bucket, it may take some time before it shows up in the results of gsutil ls.
The second caveat is that the statements on strong consistency do not apply if you allow your object to be cached using normal HTTP caching mechanisms. If you've allowed your object to be cached, which also implies it is publicly readable, then you may receive an out-of-date cached copy of an object instead of the one you just wrote, or you may still receive a copy of the object from a cache after it has been deleted from Cloud Storage. This is expected HTTP behavior. For more details, look up the Cache-Control header, both in the Cloud Storage documentation and in the HTTP specifications.
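As a hedged illustration, the Cache-Control header can be set per object with gsutil setmeta (the object names and max-age value are placeholders):

test-vm$ gsutil setmeta -h "Cache-Control:private, max-age=0" gs://gce-oreilly-example/hello
test-vm$ gsutil setmeta -h "Cache-Control:public, max-age=3600" gs://gce-oreilly-example/page.html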
Summary
In this chapter, we learned that Cloud Storage is a BLOB storage system, and what exactly that means. Then we saw how to use Cloud Storage to create buckets and objects via the Cloud Console UI, the gsutil command-line tool, and your own Python code using the Cloud Storage JSON API. We discussed the Cloud Storage ACL model in detail, and touched on the concepts of object immutability and strong consistency.
Up Next
Now that we've learned about the relatively unstructured storage mechanisms available to Compute Engine, namely Persistent Disk and Cloud Storage, we'll explore the more structured storage mechanisms of Cloud SQL and Cloud Datastore.