Protein size - how do I measure it, and why is it important?


To understand proteins and their function, we have to understand the way they fold, aggregate and interact. Conformation is key to protein function and can be revealed by measuring size.

Understanding protein size can reveal insights into protein function

The primary sequence of proteins encodes their structure, but not uniquely. A single sequence can adopt different folds or conformations. These may play a role in function or regulation, such as the domain movements seen in many enzymes, or may represent misfolded species such as aggregates.

For example, insulin is produced as a hexamer in the body, which is stable and so can be stored over time, but it is the much smaller and fast-reacting monomer form that is biologically active. Insulin can also form amyloid aggregates, which can cause severe problems for insulin therapy patients as the fibrils build up at injection sites. Knowledge of insulin’s size and associated oligomeric state can therefore provide insights into its function and state.

In general, the detection of protein aggregation is one of the most common uses of protein sizing. As we have seen with insulin, aggregation can affect the efficacy of biological drugs, but the aggregation of native proteins in situ can also have extensive biological implications. For example, the aggregation of alpha-synuclein is frequently observed in Parkinson’s disease. As such, the ability to monitor aggregation and understand the relationship between size and function is critically important in biomedical research, as well as in other fields such as drug manufacturing.

Protein size changes with folding, multimers and aggregation

Protein size can change dramatically between the folded, unfolded, multimeric and aggregated forms of the same molecule. The hydrodynamic radius (Rh), indicated here by the dotted lines, is a common measure of protein size.

Some proteins can also change the way they are folded, and bring about a change in function by “revealing” new interaction sites. The protein lymphotactin (otherwise known as XCL1) exists in an equilibrium between two differently sized forms under physiological conditions. The change between the two forms is profound: residues on the inside of one fold are exposed in the other. As a result, one of the conformations is monomeric and interacts with the cell surface receptor XCR1, while the other is a dimer that binds glycosaminoglycans. This is a clear example of how information about size leads to information about the conformation and ultimately about the function of the protein.

As well as providing insights into the function of single proteins, protein sizing can also reveal mechanisms in highly complex biological systems. For example, accurate sizing found that the size of von Willebrand Factor (VWF) protein is crucial in inducing aggregation of red blood cells, a process involved in clotting.

Avoiding confusion – how do you size proteins?

When studying size, care should be taken to get definitions right.

Size usually refers to one of three parameters: molecular weight (MW), hydrodynamic radius or Stokes radius (Rh), or the radius of gyration (RG). The MW is the total mass of all the atoms in one protein. Rh is defined as the radius of a hard sphere that diffuses at the same rate as the sample particle. RG is the mass-weighted root-mean-square distance of each mass element from the molecule's centre of mass. Because of the way it is calculated, RG is more structure-dependent than Rh. The ratio of RG to Rh is used to give an indication of structure: in compact globular proteins RG is smaller than Rh, while in non-spherical or elongated structures the ratio rises and RG can exceed Rh.
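The definitions above can be made concrete with a small sketch. It computes RG for a set of point masses and compares it with the textbook result for a uniform solid sphere, whose Rh equals its geometric radius. The function names and the two-bead example are illustrative assumptions, not part of any sizing software.

```python
import math

def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of each mass element from the centre of mass."""
    total = sum(masses)
    centre = [sum(m * c[i] for c, m in zip(coords, masses)) / total for i in range(3)]
    weighted_sq = sum(
        m * sum((c[i] - centre[i]) ** 2 for i in range(3))
        for c, m in zip(coords, masses)
    )
    return math.sqrt(weighted_sq / total)

def solid_sphere_rg(radius):
    """RG of a uniform solid sphere of the given radius: sqrt(3/5) * R."""
    return math.sqrt(3.0 / 5.0) * radius

# Two equal beads 2 nm apart: each sits 1 nm from the centre of mass.
rg_pair_nm = radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])

# For a hard sphere Rh is the geometric radius, so RG/Rh = sqrt(3/5), about 0.775.
sphere_ratio = solid_sphere_rg(1.0) / 1.0

print(f"RG of bead pair: {rg_pair_nm:.2f} nm, RG/Rh of solid sphere: {sphere_ratio:.3f}")
```

A ratio close to 0.775 is therefore what a compact, sphere-like particle would give; real proteins deviate from this idealised value depending on shape and hydration.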

Another thing to note is the conversion between angstroms, used in crystallography, and nanometres which are used to express Rh or the RG. One angstrom equals 0.1 nanometres.

Knowing which definition of size is of interest is important when choosing the appropriate sizing method. Below are some of the most frequently used techniques for measuring size along with some of their advantages and limitations.

Common approaches for measuring protein size

Analytical ultracentrifugation (AUC)

This technique creates a strong centrifugal field that causes the protein in solution to sediment, forming a clear boundary with the rest of the solution. As the sample is being spun, absorption and refractive index scans follow the movement of the particles at the boundary and yield a parameter called the sedimentation coefficient. This is linked to the size of the molecule, which can be calculated in terms of both molecular weight and hydrodynamic radius.
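One standard way to link the sedimentation coefficient to molecular weight is the Svedberg equation, M = sRT / (D(1 - v̄ρ)), which also needs the diffusion coefficient D, the partial specific volume v̄ of the protein and the solvent density ρ. The sketch below is a minimal illustration of that relation only; the input values are made-up, protein-like numbers, and real AUC analysis software fits far more detailed models.

```python
GAS_CONSTANT = 8.314462618  # J / (mol K)

def svedberg_molar_mass(sed_coeff_s, diffusion_m2_s, partial_specific_volume_m3_kg,
                        solvent_density_kg_m3, temperature_k):
    """Molar mass in kg/mol from the Svedberg equation."""
    buoyancy = 1.0 - partial_specific_volume_m3_kg * solvent_density_kg_m3
    return sed_coeff_s * GAS_CONSTANT * temperature_k / (diffusion_m2_s * buoyancy)

# Illustrative inputs (assumed, not measured): 4.3 S, D = 6.1e-11 m^2/s,
# partial specific volume 0.735 cm^3/g, water-like solvent at 20 degrees C.
molar_mass_kg_mol = svedberg_molar_mass(
    sed_coeff_s=4.3e-13,
    diffusion_m2_s=6.1e-11,
    partial_specific_volume_m3_kg=0.735e-3,
    solvent_density_kg_m3=998.0,
    temperature_k=293.15,
)
molar_mass_kda = molar_mass_kg_mol  # 1 kg/mol is numerically 1 kDa per molecule

print(f"Estimated molecular weight: {molar_mass_kda:.1f} kDa")
```

The buoyancy term (1 - v̄ρ) is small for proteins in aqueous buffer, so modest errors in the partial specific volume or the solvent density carry through strongly to the result.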

AUC allows the composition of mixtures of proteins to be characterised, for example, the determination of the abundance of oligomeric species in a sample. Additional information about shape and molecular interactions can also be obtained.

However, AUC is expensive, very slow (up to 24 h per sample) and requires highly trained staff to get quality data. The relationship between sedimentation velocity and size is also not straightforward, as it is affected by protein shape, concentration and ionic strength. As such, careful modelling of parameters such as shape and solution viscosity is needed to get accurate results.

Dynamic light scattering (DLS)

DLS measures light scattered by particles in solution. The light coming off each particle interferes with the light coming off the other particles and either enhances or reduces the signal. The enhancement or reduction will change with particle diffusion and this causes fluctuations in signal. The diffusion coefficient can be calculated from these fluctuations, which in turn can be used to determine the hydrodynamic radius using the Stokes-Einstein equation.
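The final step described above, converting a diffusion coefficient into a hydrodynamic radius, follows the Stokes-Einstein equation Rh = kT / (6πηD). The following is a minimal sketch of that conversion only; the default viscosity assumes water at 25 °C and the diffusion coefficient is an illustrative value, not a measurement.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def hydrodynamic_radius(diffusion_m2_s, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Stokes-Einstein: radius (m) of a hard sphere with the given diffusion coefficient."""
    return BOLTZMANN * temperature_k / (6.0 * math.pi * viscosity_pa_s * diffusion_m2_s)

# Assumed example: D = 1.1e-10 m^2/s in water at 25 degrees C.
rh_nm = hydrodynamic_radius(1.1e-10) * 1e9
print(f"Rh = {rh_nm:.2f} nm")  # roughly 2.2 nm
```

Because Rh depends on both temperature and viscosity, the buffer viscosity at the measurement temperature has to be known for the conversion to be meaningful.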

DLS is an easy, straightforward method for the non-expert user, but it is a very low resolution technique; if you have a mixture of two proteins, they must differ in size by at least 50% for the peaks to be resolved. It is also biased by larger particles in the mix, because scattering intensity is proportional to the sixth power of the particle diameter. This means that larger particles will be responsible for a disproportionate amount of the scattered light and thus be over-represented in the size distribution. Even a small amount of aggregation, or dust, could result in a bias.
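The sixth-power dependence is easy to underestimate, so here is a small worked example. It assumes simple Rayleigh-type scattering (intensity proportional to number times diameter to the sixth power) and a made-up mixture in which only one particle in a thousand is an aggregate.

```python
def intensity_fractions(populations):
    """Fraction of scattered intensity per species, assuming I ~ number * d**6.

    populations maps a label to (number_fraction, diameter_nm).
    """
    weights = {name: n * d ** 6 for name, (n, d) in populations.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical sample: 99.9% monomer (5 nm) and 0.1% aggregate (50 nm) by number.
sample = {"monomer": (0.999, 5.0), "aggregate": (0.001, 50.0)}
fractions = intensity_fractions(sample)

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.4%} of scattered intensity")
```

In this sketch the aggregate, despite being only 0.1% of the particles, contributes more than 99% of the scattered light, which is why trace aggregates or dust can dominate a DLS result.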

Taylor dispersion analysis (TDA)

For Taylor dispersion analysis, a small volume of sample is injected into a laminar flow of buffer and allowed to flow for a defined pathlength before detection using UV absorption (see “Protein Quantification” for an in-depth explanation). The broadening of the peak as it travels through the capillary is caused by Taylor dispersion, and the degree of Taylor dispersion is governed by the size of the protein(s) in the sample.

When a fluid is subjected to laminar flow, it adopts a parabolic flow profile in which the centre of the channel moves fastest and the fluid at the walls of the channel does not move at all. This convection stretches the injected sample plug along the channel, while radial diffusion moves molecules between the fast and slow streamlines and so counteracts the stretching. The combination of these two effects, convection and radial diffusion, is called Taylor dispersion. Because the convection stems from the flow and is the same for every sample, diffusion is the only contribution that changes between samples: the faster a protein diffuses, the narrower its peak. The UV absorbance trace, which tracks the sample as it passes the detector, can therefore be corrected for convection to give a measure of diffusion alone. From this, the hydrodynamic radius can be determined.
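As a rough illustration of how peak broadening maps to size, the sketch below uses the commonly quoted Taylor dispersion relation D = a²t / (24σ²), where a is the capillary radius, t the residence time and σ² the temporal variance of the peak, followed by the Stokes-Einstein equation. It assumes a Gaussian peak, a single species and flow conditions under which this relation holds; the capillary dimensions and peak width are invented for the example.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def taylor_diffusion_coefficient(capillary_radius_m, residence_time_s, peak_variance_s2):
    """Diffusion coefficient (m^2/s) from the temporal variance of a Taylor-dispersed peak."""
    return capillary_radius_m ** 2 * residence_time_s / (24.0 * peak_variance_s2)

def stokes_einstein_radius(diffusion_m2_s, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Hydrodynamic radius (m) for a given diffusion coefficient."""
    return BOLTZMANN * temperature_k / (6.0 * math.pi * viscosity_pa_s * diffusion_m2_s)

# Assumed setup: 75 um inner-diameter capillary, peak centred at 120 s,
# Gaussian peak with variance 70.3125 s^2 (standard deviation about 8.4 s).
diffusion = taylor_diffusion_coefficient(37.5e-6, 120.0, 70.3125)
rh_nm = stokes_einstein_radius(diffusion) * 1e9

print(f"D = {diffusion:.2e} m^2/s, Rh = {rh_nm:.2f} nm")
```

Note the inverse relationship: a broader peak (larger variance) means slower diffusion and therefore a larger hydrodynamic radius.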

Taylor dispersion analysis can be performed using a Viscosizer and will provide both the size and the viscosity. However, Viscosizers are expensive relative to other sizing technologies, require high volumes and concentrations of sample, and suffer from the drawbacks of determining concentration based on UV absorption: specifically, a susceptibility to contamination and the need to know the extinction coefficient of the sample protein.

Pulsed-field-gradient (PFG) Nuclear Magnetic Resonance (NMR)

PFG makes NMR signals diffusion-dependent. In normal NMR, nuclear spins precess in a cone shape about the axis of the magnetic field at a specific frequency. In PFG NMR, a magnetic field gradient pulse makes the phase of that precession depend on position, winding the spins into a helix along the gradient axis. To “un-wind” the helix, a second gradient pulse is applied after a short delay. However, if the nuclear spins have moved in space during that time, i.e. by diffusion, the un-winding is not perfect, and the NMR signal is reduced. In this way the NMR spectrum can give information about diffusion, which again is linked to hydrodynamic radius.
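The signal loss described above is commonly modelled with the Stejskal-Tanner equation, I/I0 = exp(-(γgδ)²(Δ - δ/3)D), for rectangular gradient pulses of strength g and duration δ separated by a diffusion delay Δ. The sketch below applies it in both directions; the gradient settings and diffusion coefficient are assumed values chosen only to give a readable number.

```python
import math

GAMMA_PROTON = 2.6752218744e8  # rad s^-1 T^-1, gyromagnetic ratio of 1H

def b_value(gradient_t_m, pulse_duration_s, diffusion_delay_s, gamma=GAMMA_PROTON):
    """Diffusion weighting factor (s/m^2) for rectangular gradient pulses."""
    return (gamma * gradient_t_m * pulse_duration_s) ** 2 * (
        diffusion_delay_s - pulse_duration_s / 3.0
    )

def attenuation(diffusion_m2_s, b):
    """Fraction of signal remaining, I/I0."""
    return math.exp(-b * diffusion_m2_s)

def diffusion_from_attenuation(signal_ratio, b):
    """Invert the Stejskal-Tanner equation for a single measurement."""
    return -math.log(signal_ratio) / b

# Assumed settings: 0.3 T/m gradient, 2 ms pulses, 100 ms diffusion delay.
b = b_value(0.3, 2e-3, 0.1)
ratio = attenuation(1.0e-10, b)
recovered = diffusion_from_attenuation(ratio, b)

print(f"I/I0 = {ratio:.3f}, recovered D = {recovered:.2e} m^2/s")
```

In practice the gradient strength is usually stepped over a series of values and D is obtained from a fit to the whole decay rather than from a single point.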

The main limitation of this technique is that it requires access to a suitable NMR machine. Acquiring one comes with high cost and footprint, as well as the need to employ dedicated technicians. Small, benchtop spectrometers generally cannot do PFG.

Gel electrophoresis

Gel electrophoresis can be used to determine molecular weight by SDS-PAGE. The negatively charged SDS (sodium dodecyl sulphate) binds most soluble proteins and denatures them, meaning that their mobility in the gel depends only on molecular weight. The molecular weight can then be determined by comparison with a calibration curve built from standards of known molecular weight. While the method is easy to carry out, its sizing accuracy varies between proteins, depending on their structure and sequence.
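A calibration curve of this kind is typically a straight-line fit of log10(molecular weight) against the relative migration distance (Rf) of a set of standards. The sketch below shows the arithmetic with a plain least-squares fit; the ladder values are invented so that they fall exactly on a line, and real ladders will scatter around it.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x

def estimate_mw_kda(rf, slope, intercept):
    """Read a molecular weight (kDa) off the log-linear calibration curve."""
    return 10 ** (slope * rf + intercept)

# Hypothetical ladder: (relative migration Rf, molecular weight in kDa).
ladder = [(0.1, 100.0), (0.3, 50.0), (0.5, 25.0), (0.7, 12.5)]
slope, intercept = fit_line([rf for rf, _ in ladder], [math.log10(mw) for _, mw in ladder])

unknown_mw = estimate_mw_kda(0.4, slope, intercept)
print(f"Band at Rf 0.4 corresponds to about {unknown_mw:.1f} kDa")
```

Estimates are only trustworthy for bands that run between the smallest and largest standards, since the log-linear relationship breaks down towards the edges of a gel's separation range.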

In Native-PAGE, which does not denature proteins, the mobility depends on both hydrodynamic radius and charge. In both SDS and native PAGE, the use of a matrix can skew results if even small interactions between the matrix and protein occur.

Native Mass Spectrometry (MS)

MS is an analytical technique that ionises molecules and separates them based on their mass-to-charge ratio, from which molecular weight can be determined. For Native MS, proteins are kept in their native state prior to ionisation, which is usually done by electrospray ionisation (ESI). The ions are then accelerated and separated by electric or magnetic fields according to their mass-to-charge ratio, before detection.
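Because electrospray produces a series of charge states for the same protein, molecular weight is not read off a single peak. A minimal sketch of the usual arithmetic: for two adjacent peaks from protonated ions with charges z and z + 1, the charge and the neutral mass follow from the two m/z values. The 14,300 Da protein and its peaks are hypothetical, and the calculation assumes protons are the only charge carriers.

```python
PROTON_MASS = 1.007276  # Da

def mz_for_charge(neutral_mass_da, charge):
    """m/z of an ion carrying `charge` protons."""
    return (neutral_mass_da + charge * PROTON_MASS) / charge

def deconvolute_adjacent_peaks(mz_high, mz_low):
    """Return (charge of the higher-m/z peak, neutral mass in Da).

    mz_high carries charge z and mz_low carries charge z + 1.
    """
    charge = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    neutral_mass = charge * (mz_high - PROTON_MASS)
    return charge, neutral_mass

# Simulate two neighbouring charge states of a hypothetical 14,300 Da protein.
peak_8 = mz_for_charge(14300.0, 8)
peak_9 = mz_for_charge(14300.0, 9)

charge, mass = deconvolute_adjacent_peaks(peak_8, peak_9)
print(f"Charge state {charge}+, neutral mass {mass:.1f} Da")
```

The same calculation applied to an oligomer gives the mass of the whole complex, which is how native MS distinguishes, for example, monomers from dimers.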

Native MS can also be combined with ion mobility (IM), a technique which separates ionised molecules based on their mobility in a gas. The mobility depends on the collision cross section (CCS), which is the effective area that the ion presents to collisions with the gas molecules. It is related to the particle’s size and shape.

Native MS is an information-rich technique, especially when combined with IM, but similarly to PFG NMR, it requires both expensive and sizeable equipment. Further to this, the state of the intact proteins following ionisation remains the subject of debate.

Size Exclusion Chromatography with Multi-Angle Light Scattering (SEC-MALS)

In SEC, molecules are separated based on hydrodynamic volume. This is achieved by passing the sample through a column containing a stationary phase, or matrix, with pores: smaller molecules diffuse furthest into the matrix, and so are the last to be eluted.

The method can be extended with MALS, which measures scattered light at different angles to the radiating beam. The intensity will increase when the light encounters a particle of different refractive index, such as a protein, and the increase correlates with concentration and molecular weight. Hence, the molecular weight can be determined if the concentration is known or through the use of a calibration plot. For bigger particles, the intensity will also depend on detection angle, and this can be used to estimate the radius of gyration.

This method provides a lot of information about the sample, but it is slow and can require large volumes and concentrations of sample. Any interaction of the protein with the stationary phase will affect accuracy, with samples that spend time in the pores appearing “smaller”. Another issue is that if particles do not differ greatly in size, there will not be clear separation in their elution times.

DNA nanolevers

This method uses DNA ‘levers’ attached to a surface, to which proteins of interest are attached. The levers are induced to sway over the surface by an applied voltage, and that swaying can be followed because the DNA has a fluorescent probe attached. It can then be compared to a theoretical model from which a corresponding hydrodynamic radius can be calculated.

Other information, such as binding kinetics and conformational changes, can also be obtained from this system. But for each protein, the nanolevers will have to be specifically functionalised, requiring specific knowledge about the protein’s binding behaviour.

Microfluidic diffusional sizing (MDS)

MDS on the Fluidity One-W determines protein size by measuring diffusion in laminar flow. The ratio between the pre-labelled sample and analyte stream is used to calculate the diffusion coefficient, and then the hydrodynamic radius. The protein of interest can be measured in buffers as well as more complex samples such as cell medium and cell lysate.


The principle of Microfluidic Diffusional Sizing (MDS): the size and interactions of a fluorescently pre-labelled protein of interest can be measured in a near-native state.

It circumvents many of the limitations of other available methods: it doesn’t need a matrix like PAGE or SEC, it does not rely on UV absorption like the Viscosizer, dedicated technicians are not required and it is fast, taking only 7-8 minutes. Plus, proteins are analysed in near-native conditions. If you need fast, sensitive near-native sizing, the Fluidity One-W is a great choice.

Choosing the right technology for protein sizing

As discussed above, each of the various technologies has strengths and limitations, ranging from the sensitivity of the machine to the degree of expertise required to use the approach, as summarised in Table 1 below.

Table 1: Comparison of common protein sizing methods

| Method | Method of detection | Definition of size | Sizing range | Cost per sample | Limitations |
| --- | --- | --- | --- | --- | --- |
| DLS | Light scattering | Hydrodynamic radius | 0.3 nm - 10 µm | No consumables. DLS cost = £37,000 | Biased towards large species, which can mask the presence of small species in the system. |
| Analytical Ultracentrifugation (AUC) | Sedimentation coefficient | Molecular weight and hydrodynamic radius | 100 Da - 1.5 MDa | | Takes 24 hours per sample. Data analysis requires a high degree of expertise. |
| Taylor Dispersion (Viscosizer) | Taylor dispersion | Hydrodynamic radius | 0.2 - 50 nm | No consumables. Instrument = £60,000 | Cannot work below 1 mg/mL concentration. Requires larger sample volumes. |
| SEC-MALS | Chromatography and light scattering | Molecular weight and radius of gyration | 1 kDa - 10 MDa. Cannot size below 15 nm | Columns = >£100 but reusable. AKTA system = £100,000 | Potential secondary interactions with the stationary phase. Compatibility of non-interacting mobile phase with solubility of protein. Equilibrium/loss of concentration in the column leading to dissociation. |
| Gel Electrophoresis | Electrophoretic mobility | Molecular weight | Gel dependent | Precast gels = £25 per 12 samples | In SDS-PAGE proteins are denatured. Any interaction with the matrix will skew results. |
| PFG NMR | Changes in NMR signal | Hydrodynamic radius | 1 pm - 50 µm (diffusion constant: 10^-7 to 10^-14) | ~£75 per hour at university NMR services, plus consumables | No benchtop solution. Requires trained technicians. |
| DNA Nanolevers | Hydrodynamic friction detected through fluorescence | Hydrodynamic radius | | Consumables include the nanolever 'chips' along with special buffer, glassware and cleaning materials. | Knowledge of the protein is necessary, as the nanolever has to be specifically functionalised. |
| Microfluidic Diffusional Sizing (MDS) (Fluidity One-W) | Laminar flow diffusion detected through fluorescence | Hydrodynamic radius | 0.5 - 20 nm | £4 per consumable chip | Requires pre-labelling. |

The specific application and end user will influence the choice of technology, and many projects will benefit from using multiple corroborating technologies to build confidence in observations. Because all the technologies above use slightly different measurements to determine size, results won’t always overlay exactly, so it is wise neither to rely on a single measurement alone nor to discount outliers out of hand.

We’ve observed an excellent correlation between results generated using MDS and those from DLS and TDA, but have found improved accuracy with MDS at lower concentrations on the proteins tested. This, combined with the Fluidity One-W's speed, precision and ease of use, makes it an excellent addition to any suite of sizing tools.

Contact us if you have any questions about protein size determination, and we'll be happy to help.
