A composite model for DNA torsion dynamics*

Mariano Cadoni†

Dipartimento di Fisica, Università di Cagliari and INFN, Sezione di Cagliari, Cittadella Universitaria 09042 Monserrato, Italy

Roberto De Leo‡

Dipartimento di Fisica, Università di Cagliari and INFN, Sezione di Cagliari, Cittadella Universitaria 09042 Monserrato, Italy

arXiv:q-bio/0604014v2 [q-bio.BM] 20 Apr 2006

Giuseppe Gaeta§

Dipartimento di Matematica, Università di Milano, via Saldini 50, I-20133 Milano, Italy

DNA torsion dynamics is essential in the transcription process; a simple model for it, in reasonable agreement with experimental observations, has been proposed by Yakushevich (Y) and developed by several authors; in this, the DNA subunits made of a nucleoside and the attached nitrogen bases are described by a single degree of freedom. In this paper we propose and investigate, both analytically and numerically, a "composite" version of the Y model, in which the nucleoside and the base are described by separate degrees of freedom. The model proposed here contains the Y model as a particular case and shares with it many features and results, but represents an improvement from both the conceptual and the phenomenological point of view. It provides a more realistic description of DNA and possibly a justification for the use of models which consider the DNA chain as uniform. It shows that the existence of solitons is a generic feature of the underlying nonlinear dynamics and is to a large extent independent of the detailed modelling of DNA. The model we consider supports solitonic solutions, qualitatively and quantitatively very similar to the Y solitons, in a fully realistic range of all the physical parameters characterizing the DNA.

I. INTRODUCTION

The possibility that nonlinear excitations (in particular, kink solitons or breathers) in DNA chains play a functional role has attracted the attention of biophysicists as well as nonlinear scientists since the pioneering paper of Englander et al. [17], and the works by Davydov on solitons in biological systems [14]. A number of mechanical models of the DNA double chain have been proposed over the years, focusing on different aspects of the DNA molecule and on different biological, physical and chemical processes in which DNA is involved. Here we will not discuss these, but just refer the reader to the discussions of such attempts given in the book by Yakushevich [56] and in the review paper by Peyrard [39] (see also the conference [38]), also for what concerns earlier attempts which constituted the basis on which the models considered below were first formulated [63]. Similarly, we will not describe the structure and functioning of DNA, but just refer e.g. to [9, 19, 46]. See also [38, 40] for the role of Nonlinear Dynamics modelling in the understanding of DNA, and [33, 49] for DNA single-molecule experiments (these were initiated about fifteen years ago [48], but their range and precision have dramatically increased in recent years; the formation of bubbles in a double-stranded DNA has been observed in [1]).

In recent years, two models have been extensively studied in the Nonlinear Physics literature; these are the model by Peyrard and Bishop [41] (and the extensions of this formulated by Dauxois [13] and later on by Barbi, Cocco, Peyrard and Ruffo [3, 4]; see also Cocco and Monasson [11]. More recent advances are discussed in [39] and [5, 12, 51]) and the one by Yakushevich [52]; we will refer to these as the PB and the Y models respectively. Original versions of these models are discussed in [26]; they are put in perspective within a "hierarchy" of DNA models in [57]. An attempt to blend together the two is given in [53]; see also [31].
Interplay between radial and torsional degrees of freedom is considered more organically in [3, 4]. The PB model is primarily concerned with DNA denaturation, and describes degrees of freedom related to "straight" (or "radial") separation of the two helices which are wound together in the DNA double helical molecule. On the other hand, the Y model, on which we focus in this note, is primarily concerned with rotational and torsional degrees of freedom of the DNA molecule, which play a central role in the process of DNA transcription [64]. In this model, one studies a system of nonlinear equations which in the continuum limit reduce to a double sine-Gordon type equation; the relevant nonlinear oscillations are kink solitons (which are solitons in both the dynamical and the topological sense) which describe the unwinding of the double helix in the transcription region. The latter is a "bubble" of about 20 bases, to which RNA Polymerase (RNAP) binds in order to read the base sequence and produce the RNA Messenger; the RNAP travels along the DNA double chain, and so does the unwound region.

The proposal of Englander et al. [17] was that if the nonlinear excitations are not created or forced by the RNAP but are anyway present due to the nonlinear dynamics of the DNA double helix itself, a number of questions (in particular, concerning energy flows) receive a simple explanation.

The Y model has been studied in a number of papers, in particular for what concerns its solitonic solutions; here we will quote in particular [20, 21, 29, 53, 54, 57]. It has been shown that it gives a correct prediction of quantities related to small amplitude dynamics, such as the frequency of small torsional oscillations, as well as of quantities related to fully nonlinear dynamics, such as the size of solitonic excitations describing transcription bubbles [26, 56]. Moreover, in its "helicoidal" version, it provides a scenario for the formation of nonlinear excitations out of linear normal modes lying at the bottom of the dispersion relation branches [26]. On the other hand, if we try to fit the observed speed of waves along the chain [56], this is possible only upon assuming unphysical values for the coupling constants [57].

The Y model is also a very simple one, and adopts the same kind of simplification as in the PB model. In particular, two quite strong features of the models are:

• (a) there is a single (angular) degree of freedom for each nucleotide;
• (b) all bases are considered as identical.

These were in a sense at the basis of the success of the model, in that thanks to these features the model can be solved exactly and one can check that predictions allowed by the model correspond to the real world situations for certain specific quantities. But the features mentioned above are of course not in agreement with the real situation. Indeed, it is well known that bases are quite different from each other, and in particular purines are much bigger than pyrimidines; hence feature (b), albeit necessary for an analytical treatment of the model, is definitely unrealistic. Moreover, it is quite justified to consider several groups of atoms within a single nucleotide (the phosphodiester chain, the sugar ring, and the nitrogen base) as substantially rigid subunits; but these (in particular the sugar ring) have some degree of flexibility, and what is more they have a considerable freedom of displacement (in particular for what concerns torsional and rotational movements) with respect to each other. Thus, even in a simple modelling, feature (a) is not justified per se, and it seems quite appropriate to consider several subunits within each nucleotide. In this sense, we will speak of a "composite" Yakushevich (Y) model. Needless to say, if such a more detailed modelling produced results very near to those of the simple Y model, this should be seen as a confirmation that the latter correctly captures the relevant features of DNA torsional dynamics, and hence would justify a posteriori feature (a) of the Y model.

In this work we propose and study a composite Y model (in the sense mentioned above), in which we describe with two independent angular degrees of freedom the nucleoside (i.e. the segment of the sugar-phosphate backbone pertaining to the nucleotide) and the nitrogen base in each nucleotide. The purpose of our study of such a model is to shed light on the following points.

• (A) We want to take care (to some extent) of correcting feature (a) above and thus investigate, by comparing results, how justified the original Y modelling in terms of one degree of freedom per nucleotide is.
• (B) We aim at opening the way to correct, or justify, feature (b) of the Y model. Indeed, in our model we will consider separately the part of the nucleotide which is replicated identically in each nucleotide (the unit of the sugar-phosphate backbone), and the part which varies from one nucleotide to the other (the bases). We will then be able to study the different role of the two.
• (C) We want to check the dependence of the solitonic solution of the model on both the geometry and the values of the physical parameters chosen. In particular, we would like to understand how far the existence of solitons is a generic feature of DNA and whether a more realistic choice of the model geometry is consistent with phenomenologically acceptable values of the physical parameters.

It will turn out that the Y model, which can be considered as a particular case of our model, captures the essential features of DNA nonlinear dynamics. The more realistic geometry of the model we use in this paper enables a drastic improvement of the descriptive power of our model at both the conceptual and the phenomenological level: on the one hand the composite Y model keeps almost all the relevant features of the Y model, but on the other hand it allows for a more realistic choice of the physical parameters.

* Work supported in part by the Italian MIUR under the program COFIN2004, as part of the PRIN project "Mathematical Models for DNA Dynamics (M²D²)".
† Electronic address: mariano.cadoni@ca.infn.it
‡ Electronic address: roberto.deleo@ca.infn.it
§ Electronic address: gaeta@mat.unimi.it

It will turn out that the different degrees of freedom we use play a fundamentally different role in the description of DNA nonlinear dynamics. The backbone degrees of freedom are "topological" and play to some extent a "master" role, while those associated with the base are "nontopological" and are (in fluid dynamics language) "slaved" to the former ones.

This opens an interesting possibility, i.e. to consider a more realistic model, in which differences among bases are properly considered, as a perturbation of our idealized uniform model. As the essential features of the fully nonlinear dynamics are related only to backbone degrees of freedom, we expect that such a perturbation, albeit with relevant differences in the quantitative values of some parameters entering the model (the base dynamical and geometrical parameters), will show the same kind of nonlinear dynamics as the uniform model studied here.

The paper is organized as follows. In Sect. II we will briefly review some basic known facts about the DNA structure and modelling. In Sects. III and IV we will set up our model, describe the interactions and write down the equations of motion that govern its dynamics. The physical parameters characterizing our model are discussed in Sect. V. In Sect. VI we discuss the linear approximation of our dynamical system, in particular its dispersion relations. In Sect. VII we will set up the framework for the investigation of the nonlinear dynamics and the topological excitations of our model. In Sect. VIII we will show how the Y model and Y solitons emerge as a particular case of our composite Y model and its solitons. In Sect. IX we investigate and derive numerically the solitonic solutions of our model. Finally, in Sect. X we summarize our work and present our conclusions.

II. DNA STRUCTURE AND MODELLING

DNA is a gigantic polymer, made of two helices wound together; the helices have a directionality and the two helices making a DNA molecule run in opposite directions. We will refer for definiteness to the standard conformation of the molecule (B-DNA); in this the pitch of the helix corresponds to ten base pairs, and the distance along the axis of the helix between successive base pairs is $\delta = 3.4$ Å.

The general structure of each helix can be described as follows. The helix is made of a sugar-phosphate backbone, to which bases are attached. The backbone has a regular structure consisting of repeated identical units (nucleosides); bases are attached to a specific site on each nucleoside and are of four possible types. These are either purines, which are adenine (A) or guanine (G), or pyrimidines, which are cytosine (C) or thymine (T) in DNA. It should be noted that the bases are rather rigid structures, and have an essentially planar configuration. A unit of each helix is called a nucleotide; this is the complex of a nucleoside and the attached base.

Because the two helices are wound together, each base site on one helix corresponds to a base site on the other helix. Bases at corresponding sites form a base pair; each base has only one possible partner in a base pair; bases in a pair are linked together via hydrogen bonds (two for A-T pairs, three for G-C pairs). The base pairs can be "opened" quite easily, the dissociation energy for each H-bond being of the order of 0.04 eV, hence $\Delta E \simeq 0.1$ eV per base pair. Opening is instrumental to a number of processes undergone by DNA, among which notably transcription, denaturation and replication.

The backbone structure (see Fig. 1) is made of a phosphodiester chain and a sugar ring. To one of the C atoms of the sugar ring is attached a base; this is one of the four possible bases A, C, G, T, whose sequence represents the information content of DNA and is different for different species, and to some extent for each individual. Thus, each helix is made of a succession of identical nucleosides, and attached bases which can be different at each site. Bases at corresponding sites on the two helices form a base pair, and these can be only of two types, G-C and A-T.

The atoms on each helix are of course held together by covalent bonds; apart from these, other interactions should be taken into account when attempting a description of the DNA molecule. The backbone structure has some rigidity; in particular, it would resist movements which represent a torsion of one nucleoside with respect to neighboring ones. We will refer to the interaction responsible for the forces resisting these torsions as torsional interactions. As already mentioned, the two bases composing a base pair are linked together by hydrogen bonds; we will refer to the interaction mediated by these as pairing interaction. Each base interacts with bases at neighboring sites on the same chain via electrostatic forces (bases are strongly polar); these make energetically favorable the conformation in which bases are stacked on top of each other, and are therefore referred to as stacking interaction. Finally, water filaments (thus, essentially, bridges of hydrogen bonds) link units at different sites; these are also known as Bernal-Fowler filaments [14]. In particular, they have a good probability to form between nucleosides or bases which are half a turn of the helix apart on different chains, i.e. which are near to each other in space due to the double helix geometry; these water filament-mediated interactions are therefore also called helicoidal interactions [65].
We stress that these are considerably weaker than the other interactions, and can be safely overlooked when we consider the fully nonlinear regime. They are instead of special interest when discussing small amplitude (low energy) dynamics, as, just because of their weakness, they are easily excited and introduce a length scale in the dispersion relations (see below).

If we consider large amplitude deviations from the equilibrium configurations, then motions will not be completely free: the molecule is densely packed, and the presence of the sugar-phosphate backbone (and of neighboring bases) will cause steric hindrances to the base movements. In particular, for the rotations in a plane perpendicular to the double helix axis, the bases will not be able to rotate around the C1' atom for more than a maximum angle $\varphi_0$ without colliding with the nucleoside. This will lead of course to complex behaviors as the DNA helix gets unwound; in particular, as $\varphi$ gets near to its limit value $\varphi_0$ we expect some kind of essentially (if not mathematically) discontinuous behavior. This should not be seen as a shortcoming of the model: it is indeed well known that bases rotate in a complex way while flipping about the DNA axis (see e.g. [2]).

Finally, we mention that here we consider a DNA molecule without taking into account its macro-conformational features; that is, we consider an "ideal" molecule, disregarding supercoiling, organization in histones, and all that [9].

FIG. 1: The structure of a DNA helix. A nucleotide is shown; nitrogen bases are attached to the C1' atom in the sugar ring.

III. COMPOSITE Y MODEL

As mentioned above, we will model the molecule as made of different parts ("units"), each of them behaving as a single element, i.e. as a rigid body. We consider each nucleoside N as a unit, to which a base B (considered again as a single unit) is attached.

FIG. 2: A base pair in our model. The origin of the coordinate system is in O. The angles $\theta_1$ between the lines AO and AB and $\theta_2$ between A'O and A'B' correspond to torsion of the sugar-phosphate backbone with respect to the equilibrium B-DNA conformation; the angles $\varphi_1$ between the line AB and the line BC, and $\varphi_2$ between A'B' and B'C', correspond to rotation of the bases around the C1'-N bond linking them to the nucleotide. All angles are in counterclockwise direction; thus the angles $\theta_2$ and $\varphi_2$ in the figure are negative.

A. General features

We will hence model each of the helices in the DNA double chain as an array of elements (nucleotides) made of two subunits; one of these subunits models the nucleoside, the other the nitrogen base. We will consider the bases as all equal, thus disregarding the substantial difference between them [66]. The chains, and thus the arrays, will be considered as infinite.

We will use a superscript $a = 1, 2$ to distinguish elements on the two chains, and a subscript $i \in \mathbb{Z}$ to identify the site on the chains. Thus the base pairing will be between bases $B_i^{(1)}$ and $B_i^{(2)}$, while stacking interaction will be between base $B_i^{(a)}$ and bases $B_{i+1}^{(a)}$ and $B_{i-1}^{(a)}$.

We will consider each nucleoside $N_i^{(a)}$ as a disk; bases will be seen as disks themselves, with a point on the border of $B_i^{(a)}$ attached via an inextensible rod to a point $p_c$ on the border of $N_i^{(a)}$; these points on B and N represent the locations of the N atom on B and of the C1' atom on N involved in the chemical bond attaching the base to the nucleoside. The rod can rotate by an angle $\varphi_0$ before B collides with N; on the other hand, the disk N can rotate completely around its axis. We also single out a point $p_h$ on the border of the disk modelling the base; this represents the atom(s) which form the H bond with the corresponding base on the other DNA chain.

The disks (i.e. the elements of our model) are subject to different kinds of forces, corresponding to those described above: torsional forces resisting the rotation of one disk $N_i^{(a)}$ with respect to neighboring disks $N_{i+1}^{(a)}$ and $N_{i-1}^{(a)}$ on the same chain; stacking forces between a base $B_i^{(a)}$ and neighboring bases $B_{i+1}^{(a)}$ and $B_{i-1}^{(a)}$ on the same chain; pairing forces between bases $B_i^{(1)}$ and $B_i^{(2)}$ in the same base pair; and finally, helicoidal forces corresponding to hydrogen-bonded Bernal-Fowler filaments linking bases $B_i^{(1)}$ and $B_{i\pm5}^{(2)}$ (and $B_i^{(2)}$ and $B_{i\pm5}^{(1)}$).

B. The Lagrangian

We should now translate the above discussion into a Lagrangian defining our model. This will be written as

$$ L = T - (U_t + U_s + U_p + U_h), \tag{III.1} $$

where $T$ is the kinetic energy, and the $U_a$ are the potential energies for the different interactions listed above, i.e.

• $U_t$ is the backbone torsional potential,
• $U_s$ is the stacking potential,
• $U_p$ is the pairing potential,
• $U_h$ is the helicoidal potential.

These will be modelled by two-body potentials, for which we use the notation $V_a$, to be summed over all interacting pairs in order to produce the $U_a$. We denote by $I$ the moment of inertia (around the center of mass) of the disks modelling the nucleosides, and by $I_B$ the moment of inertia of the bases around the C1' atom in the sugar ring; as the bases cannot rotate around their center of mass, this is equal to $m r^2$, where $m$ is the base mass and $r$ is the distance between the C1' atom in the sugar and the center of mass of the base.

C. The degrees of freedom

We are primarily interested in the torsional dynamics. Thus, for each element we will consider torsional movements, hence a rotation angle (with respect to the equilibrium conformation, which we take for definiteness to be B-DNA); these will be denoted as $\theta_i^{(a)}$ for the nucleoside $N_i^{(a)}$, and $\varphi_i^{(a)}$ for the base $B_i^{(a)}$. Only these rotations will be allowed in our model. All angles will be positive in the counterclockwise sense.

The angles $\theta$ represent a torsion of the sugar-phosphate backbone with respect to the equilibrium configuration; thus they are related to unwinding of the double helix. On the other hand, the angles $\varphi$ represent a rotation of the base with respect to the corresponding nucleoside; the motion described by $\varphi$ can be thought of as a rotation around the C1' atom in the sugar ring. Note that the hindrances due to the presence of backbone atoms constrain rotation of the base around the C1' atom. Thus, as mentioned above, the angles will have a different range of values:

$$ \theta_i^{(a)} \in \mathbb{R}, \qquad \varphi_i^{(a)} \in [-\varphi_0, \varphi_0], \quad \varphi_0 < \pi. \tag{III.2} $$

The actual value of $\varphi_0$ is not essential. The important feature is that the base cannot rotate freely around the C1' atom, but only pivot between certain limits. At the level of the numerical analysis the simplest way to implement the previous boundary condition is to use a confining potential, which reproduces approximately the form of a box. For this reason in Sect. IX we will add to the Hamiltonian of the system a confining potential $V_w = K \tan^4 \varphi^{(a)}$.

It should be stressed that, just on the basis of these different ranges of variation, there will be a substantial difference between the degrees of freedom described by the angles: those described by the $\theta$ angles will be topological degrees of freedom, while those described by the $\varphi$ angles will only describe local and (relatively) small motions; hence the $\varphi$ describe non-topological degrees of freedom.

D. Cartesian coordinates

In computing the kinetic energy, it will be convenient to consider Cartesian coordinates. With reference to Fig. 2, the Cartesian coordinates, in the $(x, y)$ plane orthogonal to the double helix axis, of the relevant points will be as follows. The centers of the disks, representing the position of the phosphodiester chain, will be $(x_o^{(a)}, y_o^{(a)})$; the point on the border of the disks representing the C1' atom to which the base disks are attached will be $(x_c^{(a)}, y_c^{(a)})$. The center of mass of the bases will be $(x_b^{(a)}, y_b^{(a)})$, and the point on the border of the disks modelling the bases representing the atom(s) forming the H bonds will be $(x_h^{(a)}, y_h^{(a)})$. In terms of the $\{\theta, \varphi\}$ angles, these are given by (we omit the site index $i$ for ease of writing, and give condensed formulas for the two chains, with the first sign referring to chain 1):

$$
\begin{aligned}
x_o^{(1,2)} &= \mp a, & y_o^{(1,2)} &= 0; \\
x_c^{(1,2)} &= x_o^{(1,2)} \pm R \cos(\theta^{(1,2)}), & y_c^{(1,2)} &= \pm R \sin(\theta^{(1,2)}); \\
x_b^{(1,2)} &= x_c^{(1,2)} \pm r \cos(\theta^{(1,2)} + \varphi^{(1,2)}), & y_b^{(1,2)} &= y_c^{(1,2)} \pm r \sin(\theta^{(1,2)} + \varphi^{(1,2)}); \\
x_h^{(1,2)} &= x_c^{(1,2)} \pm d_h \cos(\theta^{(1,2)} + \varphi^{(1,2)}), & y_h^{(1,2)} &= y_c^{(1,2)} \pm d_h \sin(\theta^{(1,2)} + \varphi^{(1,2)}).
\end{aligned} \tag{III.3}
$$

Here and in the following we denote by $R$ the radius of the disks describing nucleosides, i.e. the length of the segments AB and A'B' in Fig. 2 (this is the distance from the phosphodiester chain to the C1' atom); by $r$ the distance between the center of mass of the bases and the border of the disk modelling the nucleoside (i.e. the C1' atom). We also denote by $d_h$ the lengths (supposed equal) of the segments BC and B'C' joining the C1' atom on the nucleoside and the atoms of the bases forming the hydrogen bond linking this to the complementary base. The parameter $a$ corresponds to the distance between the double helix axis and the phosphodiester chain, whereas $\rho_0$ is the distance between points C and C' in the equilibrium configuration. The previous parameters are obviously related by the equation $2a = 2R + 2d_h + \rho_0$.
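As a quick check of the geometry, the condensed formulas (III.3) can be evaluated numerically. The following is only a minimal sketch: it reads (III.3) with the upper signs for chain 1 and the lower signs for chain 2, writes `theta` for the backbone angle and `phi` for the base angle, and uses lengths in arbitrary units that are purely illustrative (they are not the physical values of Sect. V). The function name `nucleotide_points` is hypothetical.

```python
import math

def nucleotide_points(theta, phi, chain, a, R, r, d_h):
    """Positions (x, y) of the reference points of one nucleotide, following Eq. (III.3).

    chain = 1 takes the upper signs, chain = 2 the lower ones.
    """
    s = 1.0 if chain == 1 else -1.0            # chain 2 is chain 1 reflected through the origin O
    xo, yo = -s * a, 0.0                        # disk centre (phosphodiester chain)
    xc = xo + s * R * math.cos(theta)           # C1' atom on the border of the nucleoside disk
    yc = yo + s * R * math.sin(theta)
    xb = xc + s * r * math.cos(theta + phi)     # centre of mass of the base
    yb = yc + s * r * math.sin(theta + phi)
    xh = xc + s * d_h * math.cos(theta + phi)   # atom(s) forming the H bond
    yh = yc + s * d_h * math.sin(theta + phi)
    return {"o": (xo, yo), "c": (xc, yc), "b": (xb, yb), "h": (xh, yh)}

# Illustrative lengths in arbitrary units (hypothetical, not the values of Sect. V).
a, R, r, d_h = 8.0, 3.0, 2.0, 4.0
p1 = nucleotide_points(0.0, 0.0, 1, a, R, r, d_h)
p2 = nucleotide_points(0.0, 0.0, 2, a, R, r, d_h)
gap = math.hypot(p1["h"][0] - p2["h"][0], p1["h"][1] - p2["h"][1])
print(gap, 2 * a - 2 * R - 2 * d_h)   # equilibrium distance between the H-bond points, two ways
```

At equilibrium (all angles zero) the distance between the two H-bond points coincides with $2a - 2R - 2d_h$, which is the relation among the geometrical parameters stated above.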


E. Kinetic energy

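With this notation, and standard computations, the kinetic energy of each nucleotide is written as

$$ T_i^{(a)} = \frac{1}{2} \left[ m r^2 \dot\varphi^2 + 2 m r (r + R \cos\varphi)\, \dot\varphi\, \dot\theta + \left( I + m (R^2 + r^2) + 2 m R r \cos\varphi \right) \dot\theta^2 \right], $$

where we have suppressed super- and sub-scripts for ease of reading. Thus, the total kinetic energy for the double chain is

$$
\begin{aligned}
T = \sum_a \sum_i T_i^{(a)} = \frac{1}{2} \sum_a \sum_i \Big[ & m r^2 \big(\dot\varphi_i^{(a)}\big)^2 + 2 m r \big(r + R \cos\varphi_i^{(a)}\big)\, \dot\varphi_i^{(a)}\, \dot\theta_i^{(a)} \\
& + \big( I + m (R^2 + r^2) + 2 m R r \cos\varphi_i^{(a)} \big) \big(\dot\theta_i^{(a)}\big)^2 \Big].
\end{aligned} \tag{III.4}
$$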

We have thus considered a general class of composite Y models; so far we have not specified the interaction potentials, which are needed to have a definite model.
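The "standard computations" behind the single-nucleotide kinetic energy can be checked numerically. The sketch below assumes, as stated in the text, that the nucleoside is a disk with moment of inertia `I` rotating about its fixed centre and that the base contributes as a point mass `m` at distance `r` from the C1' atom; it compares the closed form with the energy obtained from the Cartesian velocity of the base. Function names and numerical values are hypothetical and only illustrative.

```python
import math

def kinetic_energy(theta, phi, dtheta, dphi, m, I, R, r):
    """Closed-form kinetic energy of one nucleotide (disk plus point-like base)."""
    return 0.5 * (m * r**2 * dphi**2
                  + 2.0 * m * r * (r + R * math.cos(phi)) * dphi * dtheta
                  + (I + m * (R**2 + r**2) + 2.0 * m * R * r * math.cos(phi)) * dtheta**2)

def kinetic_energy_from_velocity(theta, phi, dtheta, dphi, m, I, R, r):
    """Same quantity computed from the Cartesian velocity of the base centre of mass."""
    # Base position relative to the disk centre:
    #   R (cos theta, sin theta) + r (cos(theta + phi), sin(theta + phi))
    vx = -R * math.sin(theta) * dtheta - r * math.sin(theta + phi) * (dtheta + dphi)
    vy = R * math.cos(theta) * dtheta + r * math.cos(theta + phi) * (dtheta + dphi)
    return 0.5 * I * dtheta**2 + 0.5 * m * (vx**2 + vy**2)

# Arbitrary illustrative state and parameters (not physical values).
state = dict(theta=0.4, phi=-0.3, dtheta=1.2, dphi=-0.7, m=2.0, I=5.0, R=3.0, r=1.5)
print(kinetic_energy(**state), kinetic_energy_from_velocity(**state))
```

The two numbers agree up to rounding, since the cross term $2Rr\cos\varphi\,\dot\theta(\dot\theta + \dot\varphi)$ of the squared velocity is what produces the $\cos\varphi$ contributions in the closed form.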

F. Modelling the interactions

We now have to specify our model by fixing analytical expressions for the terms modelling potential interactions in the Lagrangian (III.1). As one of our aims is to compare the results obtained by a composite Y model with those obtained with the simple Y model, we will make choices with the same physical content as those made by Yakushevich.

1. Torsional interactions

Torsional forces will depend only on the difference of angles (measured with respect to the equilibrium B-DNA configuration) of neighboring units on the same phosphodiester chain; thus we introduce a torsion potential $V_t$ and have

$$ U_t = \sum_a \sum_i V_t\big( \theta_{i+1}^{(a)} - \theta_i^{(a)} \big). \tag{III.5} $$

The potential $V_t$ must have a minimum in zero, and be $2\pi$-periodic in order to take into account the fundamentally discrete and quantum nature of the phosphodiester chain. Here we will take the simplest such function [67], i.e. (adding an inessential constant so that the minimum corresponds to zero energy)

$$ V_t(x) = K_t \left( 1 - \cos(x) \right), \tag{III.6} $$

where $K_t$ is a dimensional constant. Thus, our choice for torsional interactions will be

$$ U_t = K_t \sum_a \sum_i \left[ 1 - \cos\big( \theta_{i+1}^{(a)} - \theta_i^{(a)} \big) \right]. \tag{III.7} $$

The harmonic approximation for this is of course

$$ U_t^{q} = \frac{1}{2} K_t \sum_a \sum_i \big( \theta_{i+1}^{(a)} - \theta_i^{(a)} \big)^2. $$
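To get a feeling for how far the harmonic approximation of the torsional potential can be trusted, one can compare the two expressions for a few values of the angle difference. This is a minimal sketch with $K_t = 1$ in arbitrary units (an illustrative choice); the function names are hypothetical.

```python
import math

def v_torsion(x, K_t=1.0):
    """Torsional two-body potential K_t (1 - cos x), 2*pi-periodic with minimum at x = 0."""
    return K_t * (1.0 - math.cos(x))

def v_torsion_harmonic(x, K_t=1.0):
    """Quadratic (harmonic) approximation (1/2) K_t x^2."""
    return 0.5 * K_t * x * x

for x in (0.1, 0.5, 1.0, math.pi):
    exact = v_torsion(x)
    quad = v_torsion_harmonic(x)
    rel = (quad - exact) / exact
    print(f"x = {x:.3f}  exact = {exact:.5f}  harmonic = {quad:.5f}  relative error = {rel:.2%}")
```

The leading correction is of order $x^4/24$, so the two forms agree closely for small relative torsions and differ substantially only for angle differences of order one radian or more, where the periodicity of the full potential matters.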

2. Stacking interactions

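Stacking between bases will only depend on the relative displacement of neighboring bases on the same helix in the plane orthogonal to the double helix axis [68]. That is, introducing a stacking potential $V_s$ we have

$$ U_s = \sum_a \sum_i V_s\big( \rho_i^{(a)} \big), \tag{III.8} $$

where

$$ \rho_i^{(a)} := \sqrt{ \big( x_{i+1}^{(a)} - x_i^{(a)} \big)^2 + \big( y_{i+1}^{(a)} - y_i^{(a)} \big)^2 }, \tag{III.9} $$

where $x_i^{(a)}, y_i^{(a)}$ are the coordinates of the center of mass of the bases. The simplest choice corresponds to a harmonic potential [69], $V_s(\rho) = (1/2) K_s \rho^2$. This will be our choice, which again corresponds to the one made in the PB and in the Y models, so that

$$ U_s = \sum_a \sum_i \frac{K_s}{2} \big( \rho_i^{(a)} \big)^2. \tag{III.10} $$

We should however express this in terms of the $\theta$ and $\varphi$ angles. With standard algebra, using Eqs. (III.3), we obtain

$$
\begin{aligned}
U_s = K_s \sum_a \sum_i \Big[ & R^2 + r^2 - R^2 \cos\big( \theta_{i+1}^{(a)} - \theta_i^{(a)} \big) - r^2 \cos\big[ \big( \theta_{i+1}^{(a)} - \theta_i^{(a)} \big) + \big( \varphi_{i+1}^{(a)} - \varphi_i^{(a)} \big) \big] \\
& - R r \Big( \cos\big[ \big( \theta_{i+1}^{(a)} - \theta_i^{(a)} \big) + \varphi_{i+1}^{(a)} \big] + \cos\big[ \big( \theta_{i+1}^{(a)} - \theta_i^{(a)} \big) - \varphi_i^{(a)} \big] \Big) \\
& + R r \Big( \cos\big( \varphi_{i+1}^{(a)} \big) + \cos\big( \varphi_i^{(a)} \big) \Big) \Big].
\end{aligned} \tag{III.11}
$$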
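The "standard algebra" leading from the harmonic stacking potential to its expression in terms of the angles can be verified numerically for a single pair of neighbouring bases. The sketch below assumes the base centre of mass is placed as in (III.3); since both bases are on the same chain, the disk centre and the overall chain sign drop out of the difference, so the upper signs are used. `th0, ph0` and `th1, ph1` stand for the backbone and base angles at sites $i$ and $i+1$; all names and numbers are hypothetical.

```python
import math

def stacking_direct(th0, ph0, th1, ph1, K_s, R, r):
    """(1/2) K_s rho^2, with rho the in-plane distance between neighbouring base centres."""
    x0 = R * math.cos(th0) + r * math.cos(th0 + ph0)
    y0 = R * math.sin(th0) + r * math.sin(th0 + ph0)
    x1 = R * math.cos(th1) + r * math.cos(th1 + ph1)
    y1 = R * math.sin(th1) + r * math.sin(th1 + ph1)
    return 0.5 * K_s * ((x1 - x0) ** 2 + (y1 - y0) ** 2)

def stacking_expanded(th0, ph0, th1, ph1, K_s, R, r):
    """The same energy written through angle differences, one term of the sum over sites."""
    dth = th1 - th0
    return K_s * (R**2 + r**2
                  - R**2 * math.cos(dth)
                  - r**2 * math.cos(dth + (ph1 - ph0))
                  - R * r * (math.cos(dth + ph1) + math.cos(dth - ph0))
                  + R * r * (math.cos(ph1) + math.cos(ph0)))

# Arbitrary illustrative angles and parameters (not physical values).
angles = (0.3, -0.2, 0.9, 0.4)
print(stacking_direct(*angles, K_s=1.5, R=3.0, r=2.0),
      stacking_expanded(*angles, K_s=1.5, R=3.0, r=2.0))
```

Both expressions vanish when the two neighbouring nucleotides have the same angles, as expected for a potential that depends only on the relative displacement of the bases.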

3. Pairing interactions

Pairing interactions are due to stretching of the hydrogen bonds linking bases in a pair. Introducing a pairing potential $V_p$ which models the H bonds, we have

$$ U_p = \sum_i V_p\big( \theta_i^{(1)}, \theta_i^{(2)}, \varphi_i^{(1)}, \varphi_i^{(2)} \big). \tag{III.12} $$

We note that H bonds are strongly directional, so that they are quickly disrupted once the alignment between pairing bases is disrupted. This feature is traditionally disregarded in the Y model, where it is assumed that $V_p$ only depends on the distance

$$ \rho_i := \sqrt{ \big( x_i^{(1)} - x_i^{(2)} \big)^2 + \big( y_i^{(1)} - y_i^{(2)} \big)^2 } \tag{III.13} $$

between the interacting bases; that is,

$$ U_p = \sum_i V_p(\rho_i). \tag{III.14} $$

As noted by Gonzalez and Martin-Landrove [29] in the context of the Yakushevich model, one should be careful in expanding a potential $V_p(\rho)$ in terms of the rotation angles $\varphi$ and $\theta$: indeed, unless $\rho_0 = 0$, i.e. $a = R + d_h$, one would get a zero quadratic term in such an expansion (see however [24] for what concerns solitons in this context).

As for the potential $V_p$, there are two simple choices for this appearing in the literature. On the one hand, Yakushevich [52] suggests considering a potential harmonic in the intrapair distance (this would appear nonlinear when expressed through rotation angles), and this has been kept in subsequent discussions and extensions of her model [56]; on the other hand, Peyrard and Bishop [41] consider a Morse potential, and again this has been kept in subsequent discussions and extensions of their model [39].

There is no doubt that the Morse potential is more justified in physical terms; however, as we wish to compare our results with those of the original Y model, we will at first consider a harmonic potential

$$ V_p^{(Y)}(\rho) = \frac{1}{2} K_p (\rho - \rho_0)^2, \tag{III.15} $$

where $\rho_0$ is the intrapair distance in the equilibrium configuration. Moreover, again in order to compare our results with those of the original Y model, we will later on set $\rho_0 = 0$. This corresponds to setting $a = R + d_h$. These approximations can appear very crude, but experience gained (as preliminary work for the present investigation) with the standard Yakushevich model [24, 25] suggests they do not have a great impact at the level of fully nonlinear dynamics.

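We should express $V_p$ in terms of the rotation angles. Using once again the expressions (III.3), with $(x_i^{(a)}, y_i^{(a)})$ the coordinates of the points forming the H bond, we have with standard computations that

$$
\begin{aligned}
\rho_i^2 := {} & \big( x_i^{(1)} - x_i^{(2)} \big)^2 + \big( y_i^{(1)} - y_i^{(2)} \big)^2 \\
= {} & 2 \Big[ 2a^2 + R^2 + d_h^2 + R^2 \cos\big( \theta_i^{(1)} - \theta_i^{(2)} \big) + d_h^2 \cos\big[ \big( \theta_i^{(1)} - \theta_i^{(2)} \big) + \big( \varphi_i^{(1)} - \varphi_i^{(2)} \big) \big] \\
& + R d_h \Big( \cos\varphi_i^{(1)} + \cos\varphi_i^{(2)} + \cos\big[ \big( \theta_i^{(1)} - \theta_i^{(2)} \big) + \varphi_i^{(1)} \big] + \cos\big[ \big( \theta_i^{(1)} - \theta_i^{(2)} \big) - \varphi_i^{(2)} \big] \Big) \\
& - 2 a R \Big( \cos\big( \theta_i^{(1)} \big) + \cos\big( \theta_i^{(2)} \big) \Big) - 2 a d_h \Big( \cos\big( \varphi_i^{(1)} + \theta_i^{(1)} \big) + \cos\big( \varphi_i^{(2)} + \theta_i^{(2)} \big) \Big) \Big].
\end{aligned} \tag{III.16}
$$

With this, our choice for the pairing part of the Hamiltonian will be

$$ U_p = \sum_i V_p(\rho_i). \tag{III.17} $$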
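The expression of the squared intrapair distance through the angles can be checked against a direct evaluation from the Cartesian coordinates. The sketch below assumes the H-bond points are placed as in (III.3), with upper signs for chain 1 and lower signs for chain 2, and evaluates the harmonic pairing potential with a generic equilibrium distance `rho_0`. Function names and parameter values are hypothetical and only illustrative.

```python
import math

def rho_squared_direct(th1, ph1, th2, ph2, a, R, d_h):
    """Squared distance between the H-bond points of the two bases in a pair."""
    xh1 = -a + R * math.cos(th1) + d_h * math.cos(th1 + ph1)
    yh1 = R * math.sin(th1) + d_h * math.sin(th1 + ph1)
    xh2 = a - R * math.cos(th2) - d_h * math.cos(th2 + ph2)
    yh2 = -R * math.sin(th2) - d_h * math.sin(th2 + ph2)
    return (xh1 - xh2) ** 2 + (yh1 - yh2) ** 2

def rho_squared_expanded(th1, ph1, th2, ph2, a, R, d_h):
    """The same quantity written through the rotation angles."""
    d = th1 - th2
    return 2.0 * (2.0 * a**2 + R**2 + d_h**2
                  + R**2 * math.cos(d)
                  + d_h**2 * math.cos(d + (ph1 - ph2))
                  + R * d_h * (math.cos(ph1) + math.cos(ph2)
                               + math.cos(d + ph1) + math.cos(d - ph2))
                  - 2.0 * a * R * (math.cos(th1) + math.cos(th2))
                  - 2.0 * a * d_h * (math.cos(ph1 + th1) + math.cos(ph2 + th2)))

def v_pairing(rho, K_p, rho_0):
    """Harmonic pairing potential (1/2) K_p (rho - rho_0)^2."""
    return 0.5 * K_p * (rho - rho_0) ** 2

# Arbitrary illustrative geometry (not physical values); here rho_0 = 2a - 2R - 2d_h = 2.
a, R, d_h = 8.0, 3.0, 4.0
print(rho_squared_direct(0.3, -0.1, -0.2, 0.25, a, R, d_h),
      rho_squared_expanded(0.3, -0.1, -0.2, 0.25, a, R, d_h))
print(v_pairing(math.sqrt(rho_squared_expanded(0.0, 0.0, 0.0, 0.0, a, R, d_h)), 1.0, 2.0))
```

At equilibrium the expanded formula reduces to $4(a - R - d_h)^2$, i.e. to the square of the equilibrium intrapair distance, so the pairing energy vanishes there.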

4. Helicoidal interactions

Helicoidal interactions are mediated by water filaments (Bernal-Fowler filaments [14]) connecting different nucleotides; in particular we will consider those on opposite helices at half-pitch distance, as they are near enough in three-dimensional space due to the double helical geometry. As the nucleotides move, the hydrogen bonds in these filaments (and those connecting the filaments to the nucleotides) are stretched and thus resist differential motions of the two connected nucleotides. We will, for the sake of simplicity and also in view of the small energies involved, only consider filaments forming between nucleosides; thus only the $\theta$ angles will be involved in these interactions. We have therefore, introducing a helicoidal potential $V_h$ and recalling that the pitch of the helix corresponds to 10 bases in the B-DNA equilibrium configuration,
\[ U_h = \sum_i \left[ V_h(\theta^{(1)}_{i+5} - \theta^{(2)}_i) + V_h(\theta^{(2)}_{i+5} - \theta^{(1)}_i) \right] . \tag{III.18} \]

As the $\theta$ angles are involved, the potential $V_h$ should be $2\pi$-periodic [70]. Such water filament connections involve a large number (around 10) of hydrogen bonds; hence each of them is only slightly stretched, and it makes sense to consider the angular-harmonic approximation
\[ V_h(\theta) = K_h [1 - \cos\theta] \simeq \tfrac{1}{2} K_h \theta^2 . \tag{III.19} \]
Our choice will therefore be
\[ U_h = K_h \sum_i \left[ 2 - \cos(\theta^{(1)}_{i+5} - \theta^{(2)}_i) - \cos(\theta^{(2)}_{i+5} - \theta^{(1)}_i) \right] . \tag{III.20} \]

IV. EQUATIONS OF MOTION

In the previous sections we have set up the model and the interactions. Let us now study its dynamics. We denote collectively the variables as $\psi^a$, e.g. with $\psi = (\varphi^{(1)}, \varphi^{(2)}, \theta^{(1)}, \theta^{(2)})$. The dynamics of the model will be described by the Euler-Lagrange equations corresponding to the Lagrangian (III.1), with the terms in the interaction potential given respectively by Eqs. (III.7), (III.11), (III.17), (III.20):
\[ \frac{d}{dt} \frac{\partial L}{\partial \dot\psi^a_i} - \frac{\partial L}{\partial \psi^a_i} = 0 . \tag{IV.1} \]

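With our choices for the different terms of $L$, and writing $\bar a$ for the complementary chain of the chain $a$ (that is, $\bar 1 = 2$, $\bar 2 = 1$), these read
\begin{align*}
& m r^2 \ddot\varphi_i^{(a)} + m r [R \cos(\varphi_i^{(a)}) + r]\, \ddot\theta_i^{(a)} + m r R \sin(\varphi_i^{(a)}) (\dot\theta_i^{(a)})^2 = \\
& \quad = K_s r^2 \sin[\varphi_{i-1}^{(a)} - \varphi_i^{(a)} + \theta_{i-1}^{(a)} - \theta_i^{(a)}] - 2 a d_h K_p \sin(\varphi_i^{(a)} + \theta_i^{(a)}) \\
& \qquad - K_s r R \sin(\varphi_i^{(a)} - \theta_{i-1}^{(a)} + \theta_i^{(a)}) - K_s r R \sin(\varphi_i^{(a)} + \theta_i^{(a)} - \theta_{i+1}^{(a)}) \\
& \qquad - K_s r^2 \sin(\varphi_i^{(a)} - \varphi_{i+1}^{(a)} + \theta_i^{(a)} - \theta_{i+1}^{(a)}) + d_h K_p R \sin(\varphi_i^{(a)} + \theta_i^{(a)} - \theta_i^{(\bar a)}) \\
& \qquad + d_h^2 K_p \sin(\varphi_i^{(a)} - \varphi_i^{(\bar a)} + \theta_i^{(a)} - \theta_i^{(\bar a)}) + R (d_h K_p + 2 K_s r) \sin[\varphi_i^{(a)}] ; \\[1ex]
& m r R \cos(\varphi_i^{(a)}) (\ddot\varphi_i^{(a)} + 2 \ddot\theta_i^{(a)}) + m r^2 \ddot\varphi_i^{(a)} + I \ddot\theta_i^{(a)} + m r^2 \ddot\theta_i^{(a)} + m R^2 \ddot\theta_i^{(a)} \\
& \qquad - m r R \sin(\varphi_i^{(a)})\, \dot\varphi_i^{(a)} (\dot\varphi_i^{(a)} + 2 \dot\theta_i^{(a)}) = \\
& \quad = (K_t + K_s R^2) \sin(\theta_{i-1}^{(a)} - \theta_i^{(a)}) + K_s r R \sin(\varphi_{i-1}^{(a)} - (\theta_i^{(a)} - \theta_{i-1}^{(a)})) \\
& \qquad - K_s r^2 \sin((\varphi_i^{(a)} - \varphi_{i-1}^{(a)}) + (\theta_i^{(a)} - \theta_{i-1}^{(a)})) - 2 a K_p R \sin(\theta_i^{(a)}) \\
& \qquad - 2 a d_h K_p \sin(\varphi_i^{(a)} + \theta_i^{(a)}) - K_s r R \sin[\varphi_i^{(a)} + (\theta_i^{(a)} - \theta_{i-1}^{(a)})] \\
& \qquad - K_s r R \sin(\varphi_i^{(a)} - (\theta_{i+1}^{(a)} - \theta_i^{(a)})) + (K_t + K_s R^2) \sin(\theta_{i+1}^{(a)} - \theta_i^{(a)}) \\
& \qquad + K_s r^2 \sin[(\varphi_{i+1}^{(a)} - \varphi_i^{(a)}) + (\theta_{i+1}^{(a)} - \theta_i^{(a)})] + K_s r R \sin[\varphi_{i+1}^{(a)} + (\theta_{i+1}^{(a)} - \theta_i^{(a)})] \\
& \qquad + K_p R^2 \sin(\theta_i^{(a)} - \theta_i^{(\bar a)}) + d_h K_p R \sin(\varphi_i^{(a)} + (\theta_i^{(a)} - \theta_i^{(\bar a)})) \\
& \qquad + d_h^2 K_p \sin((\varphi_i^{(a)} - \varphi_i^{(\bar a)}) + (\theta_i^{(a)} - \theta_i^{(\bar a)})) - d_h K_p R \sin(\varphi_i^{(\bar a)} - (\theta_i^{(a)} - \theta_i^{(\bar a)})) \\
& \qquad + K_h \left( \theta_{i+5}^{(\bar a)} - 2 \theta_i^{(a)} + \theta_{i-5}^{(\bar a)} \right) . \tag{IV.2}
\end{align*}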

Note that here $a$, $R$, $r$, $d_h$ are considered as independent parameters, i.e. we have not enforced the Yakushevich condition $R + d_h = a$ (i.e. $\rho_0 = 0$). Needless to say, these equations are far too complex to be analyzed directly, and we will need to introduce various kinds of approximation. We have thus completely specified the model we are going to study, i.e. its Lagrangian, and derived the equations of motion that govern its dynamics. The choice of torsion angles as variables led to involved expressions, but our choices are physically very simple. We have considered angular-harmonic approximations (expansion up to the first Fourier mode), i.e. potentials of the form $V(x) = [1 - \cos(x)]$, for the torsion and helicoidal interactions, a harmonic approximation for the base stacking interaction, and a harmonic potential depending on the intrapair distance for the pairing interaction. Our approximations are coherent with those considered in the literature when dealing with uniform models of the DNA chain, and in particular when dealing with (extensions of) the Yakushevich model. Thus, when comparing the characteristics of our model with those of these other models, we are really focusing on the differences arising from considering separately the nucleoside and the base within each nucleotide. It would of course be possible to consider more realistic expressions for the potentials; but we believe that at the present stage this would rather obscure the relevant point here, i.e. the discussion of how such composite models can retain the remarkably good features of the Y model and at the same time overcome some of the difficulties it encounters. Finally, we note that the dynamical equations describing the model are, despite the simplifying assumptions we made at various stages, too hard to leave any hope of obtaining a general solution, either in the discrete or in the continuum version (see below) of the model.
In the next section we will focus our attention on the choice of the physical parameters appearing in our model. Later, we will investigate the dynamics beginning with the linear approximation and then in the fully nonlinear regime.

V. PHYSICAL VALUES OF PARAMETERS

In order to have a well defined model we should still assign concrete values to the parameters (both geometrical ones and coupling constants) appearing in our Lagrangian (III.1) and in the equations of motion (IV.2).

A. Kinematical parameters

Let us start by discussing kinematical parameters; in these we include the geometrical parameters as well as the mass $m$ and the moment of inertia $I$. The masses can be readily evaluated by considering the chemical structure of the bases: they follow from the masses of the atoms and their multiplicity in the different bases. As for the geometrical parameters like $R$, $a$, $r$ and $d_h$ (and the moment of inertia $I$), quite surprisingly different authors seem to provide different values for these. Rather than assuming the values given by one or another author, we have preferred to estimate the parameters using the available information about the DNA structure. Positions of atoms within the bases (which of course determine $R$, $a$, $r$ and $d_h$, and hence $I$) and geometrical descriptions of DNA are widely available to the scientific community in the form of PDB files [36]. We will use this information (which we accessed at [16] and [37]) to estimate directly all static parameters in play on the basis of the atomic positions. The geometrical parameters which are relevant for our discussion are the longitudinal width of the bases $l_b$ and of the sugar $l_s$, the distance $d_s$ of the bases from the relative sugars and the distance $d_b$ of a base from the relative dual base. We give our estimates for the masses, moments of inertia and the parameters $l$, $d_s$, $d_b$ for the different bases and their mean values in Table I. From those data and using the equations $R = l_s$, $r = d_s + \hat{l}_b/2$, $d_h = \hat{l}_b + d_s$, $a = l_s + \hat{l}_b + d_s + d_b/2$ (hats denote mean values), one obtains the average values for the geometrical parameters appearing in our Lagrangian, given in Table II.
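The arithmetic behind these relations can be retraced with a minimal sketch. It assumes the mean entries of Table I as quoted in the text ($l_s = 3.3$, $\hat{l}_b = 4.7$, $d_s = 1.5$, $d_b = 2.0$, all in angstrom); the variable names are illustrative only.

```python
# Mean lengths quoted in Table I, in angstrom (assumed inputs for this sketch).
l_s = 3.3   # longitudinal width of the sugar
l_b = 4.7   # mean longitudinal width of a base
d_s = 1.5   # distance of a base from its sugar
d_b = 2.0   # distance of a base from its dual base

# Relations stated in the text.
R = l_s
r = d_s + l_b / 2
d_h = l_b + d_s
a = l_s + l_b + d_s + d_b / 2

# With these relations a - (R + d_h) equals d_b / 2, so the Yakushevich
# condition a = R + d_h amounts to neglecting the equilibrium base-base distance.
gap = a - (R + d_h)

print(f"R = {R:.2f}, r = {r:.2f}, d_h = {d_h:.2f}, a = {a:.2f}, gap = {gap:.2f}")
```

The resulting numbers agree with Table II once $r$ is rounded to one decimal place.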

\begin{tabular}{l|cccccc}
 & A & T & G & C & mean & Sugar \\ \hline
$m$ & 134 & 125 & 150 & 110 & 130 & 85 \\
$I$ & $3.6 \cdot 10^3$ & $3.0 \cdot 10^3$ & $4.4 \cdot 10^3$ & $2.3 \cdot 10^3$ & $3.3 \cdot 10^3$ & $1.2 \cdot 10^2$ \\
$l$ & 3.2 & 4.0 & 5.0 & 2.4 & 4.7 & 3.3 \\
$d_s$ & 1.5 & 1.5 & 1.5 & 1.5 & 1.5 & \\
$d_b$ & 2.0 & 2.0 & 2.0 & 2.0 & 2.0 & \\
\end{tabular}

TABLE I: Order of magnitude for the basic geometrical parameters of the DNA. Units of measure are: atomic mass units for the masses $m$, $1.67 \cdot 10^{-47}$ kg m$^2$ for the moments of inertia $I$, and angstrom for $l$, $d_s$ and $d_b$, respectively the longitudinal width of the bases and their distances from the relative sugars and from the relative dual base. These values have been extracted from the sample generic B-DNA PDB data [37], kindly provided by the Glactone Project [27], and double checked with the data from [16], which agree within 5%. Moments of inertia of the bases have been evaluated with respect to rotations about the DNA's symmetry axis passing through the sugar's C1 atom the base is attached to; the moment of inertia of the sugar itself has been evaluated with respect to rotations about its C3-C4 axis (see Fig. 1).

\begin{tabular}{cccc}
$R$ & $r$ & $d_h$ & $a$ \\ \hline
3.3 Å & 3.8 Å & 6.2 Å & 10.5 Å \\
\end{tabular}

TABLE II: Numerical values of the geometrical parameters characterizing our model.

B. Coupling constants

The determination of the four coupling constants appearing in our model is more problematic, due partly to the difficulties in making experiments to test single coupling constants and partly to the complexity of the system itself.

1. Pairing

The coupling constant $K_p$, which appears in the pairing potential (III.17), can be easily determined by considering the typical energy of hydrogen bonds. The pairing interaction involves two (in the A-T case) or three (in the G-C case) electrostatic hydrogen bonds. The pairing potential can be modelled with a Morse function
\[ V_p(\rho) = D \left( e^{-b(\rho - \rho_0)} - 1 \right)^2 = \tfrac{1}{2} (2 D b^2) (\rho - \rho_0)^2 + O\big( (\rho - \rho_0)^3 \big) , \tag{V.1} \]

where $D$ is the potential depth, $\rho - \rho_0$ the distance from the equilibrium position $\rho_0$, and $b$ a parameter that defines the width of the well. Although throughout this paper we use the harmonic potential (III.17) to model the pairing interaction, the use of the Morse function seems more appropriate for evaluating the parameter $K_p$. The point is that the pairing coupling constant is physically determined by the behavior of the pairing potential away from its minimum. Using the harmonic approximation (III.17) to estimate $K_p$ would result in a completely unphysical value for the parameter.

Different estimates of the parameters appearing in the potential (V.1) are present in the literature. The estimates
\[ D_{AT} = 0.030\ \mathrm{eV} , \quad D_{GC} = 0.045\ \mathrm{eV} , \quad b_{AT} = 1.9\ \text{Å}^{-1} , \quad b_{GC} = 2.5\ \text{Å}^{-1} \]
are given in [10] and used in [62]. The values $D = 0.040$ eV, $b = 4.45$ Å$^{-1}$ are given in [42] and used in [3, 15, 42]. Finally, the estimates
\[ D_{AT} = 0.050\ \mathrm{eV} , \quad D_{GC} = 0.075\ \mathrm{eV} , \quad b_{AT} = b_{GC} = 4\ \text{Å}^{-1} \]
are given in [8] and used in [8, 32]. The values of the coupling constant corresponding to these different values of the parameters appearing in the Morse potential range across a whole order of magnitude:
\[ 3.5\ \mathrm{N/m} \le K_p := 2 b^2 D \le 38\ \mathrm{N/m} . \tag{V.2} \]
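The range quoted for the pairing constant can be retraced with a short sketch. It assumes only the Morse parameters listed above and the relation $K_p = 2b^2D$; the dictionary labels and the function name are illustrative, not taken from the cited works.

```python
EV = 1.602176634e-19   # joule per electronvolt
ANGSTROM = 1.0e-10     # metre

def morse_stiffness(depth_ev, b_inv_angstrom):
    """Harmonic stiffness K_p = 2 b^2 D of a Morse well, in N/m."""
    b = b_inv_angstrom / ANGSTROM          # convert 1/angstrom to 1/m
    return 2.0 * b * b * depth_ev * EV

# (D in eV, b in 1/angstrom) for the three sets of estimates quoted in the text.
estimates = {
    "set 1, A-T": (0.030, 1.9),
    "set 1, G-C": (0.045, 2.5),
    "set 2": (0.040, 4.45),
    "set 3, A-T": (0.050, 4.0),
    "set 3, G-C": (0.075, 4.0),
}

stiffness = {label: morse_stiffness(d, b) for label, (d, b) in estimates.items()}
k_min = min(stiffness.values())
k_max = max(stiffness.values())

for label, value in stiffness.items():
    print(f"{label}: K_p = {value:.1f} N/m")
```

The smallest value comes from the first A-T estimate and the largest from the third G-C estimate, which is where the two ends of the range come from.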

In our numerical investigations we will use a value of $K_p$ near the lower bound given in (V.2); that is, we adopt the value $K_p = 4$ N/m, leading to an optical frequency of $\omega_0 = \sqrt{2K_p/m} = 36$ cm$^{-1}$, so as to be in agreement with [43].

2. Stacking

The determination of the torsion and stacking coupling constants is more involved and rests on a smaller amount of experimental data. The main information is the total torsional rigidity of the DNA chain $C = S\delta$, where $\delta = 3.4$ Å is the base-pair spacing and $S$ is the torsional rigidity. It is known [6, 7] that
\[ 10^{-28}\ \mathrm{J\,m} \le C \le 4 \cdot 10^{-28}\ \mathrm{J\,m} . \tag{V.3} \]

This information is used e.g. in [17, 61], whose estimate is based on the evaluation of the free energy of superhelical winding; this fixes the range for the total torsional energy to be
\[ 180\ \mathrm{kJ/mol} \le S \le 720\ \mathrm{kJ/mol} . \tag{V.4} \]
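The conversion between the rigidity per unit length and the energy per base-pair step can be checked with a minimal sketch. It assumes $S = C/\delta$ with the base-pair spacing of 3.4 Å, expressed per mole; the function name is illustrative.

```python
AVOGADRO = 6.02214e23   # 1/mol
DELTA = 3.4e-10         # base-pair spacing in metres

def torsional_energy_kj_per_mol(c_joule_metre):
    """Convert the rigidity C (J m) into S = C / delta, expressed in kJ/mol."""
    return c_joule_metre / DELTA * AVOGADRO / 1.0e3

s_low = torsional_energy_kj_per_mol(1.0e-28)
s_high = torsional_energy_kj_per_mol(4.0e-28)
print(f"S between {s_low:.0f} and {s_high:.0f} kJ/mol")
```

The two ends of the interval agree with the rounded bounds of the text to within about 2%.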

In our composite model the total torsional energy of the DNA chain has to be considered as the sum of two parts, the base stacking energy and the torsional energy of the sugar-phosphate backbone. In order to extract the stacking coupling constant we use the further information that $\pi$-stacking bonds amount at most to 50 kJ/mol [30]. Assuming a quadratic stacking potential, as we do, and a width of the potential well of about 2 Å, we obtain the estimate $K_s = 68$ N/m. The phonon speed induced by this is $c_1 = \delta\sqrt{K_s/m} \simeq 6$ km/s, see eq. (VI.12); this is rather close to the estimate $1.8\ \mathrm{km/s} \le c_1 \le 3.5\ \mathrm{km/s}$ given in [57]. As we shall see in detail in Sect. IX, choosing smaller values for $K_s$ would have non-trivial consequences, since solitons with small topological numbers become unstable in the discrete setting when the ratios $K_s/K_p$ and $K_t/K_p$ get small enough. In particular, this value for $K_s$, together with the value of $K_t$ given below, is barely enough to allow the existence of solitons, as discussed later in this paper.

3. Torsion and helicoidal couplings

After extracting the stacking component, our estimate for the torsional coupling constant $K_t$ is in the range
\[ 130\ \mathrm{kJ/mol} \le K_t \le 670\ \mathrm{kJ/mol} . \tag{V.5} \]

Assuming (see below) that $K_h \simeq K_t/25$, so that $c_4 = \delta\sqrt{2K_t/I_s}$ (see eqs. (VI.11) and (VI.12)), all of these values for $K_t$ induce phonon speeds slightly higher than the estimates cited above, between 5 km/s and 11 km/s. For our numerical investigations, to keep the phonon speed as low as possible, we will set $K_t = 130$ kJ/mol. Finally, for the helicoidal coupling constant, following [23], we assume that $K_t$ and $K_h$ differ by about a factor 25, so that $K_h = 5$ kJ/mol.


4. Discussion

It is interesting to point out how the geometry of the model nicely fits with the estimates of the binding energies so as to induce optical frequencies and phonon speeds of the right order of magnitude (see also the discussion in Sect. VI). This is not the case in simpler models: in order to get the right phonon speed within a simple Y model one is obliged to assume for $K_t$ the unphysical value $K_t = 6000$ kJ/mol [57]. Our estimates, and hence our choices for the values of the coupling constants appearing in our model, are summarized in Table III. We will use these values of the physical parameters of DNA in the next sections, when discussing both the linear approximation and the dispersion relations as well as the full nonlinear regime and the solitonic solutions.

\begin{tabular}{cccc}
$K_t$ & $K_s$ & $K_p$ & $K_h$ \\ \hline
130 kJ/mol & 68 N/m & 3.5 N/m & 5 kJ/mol \\
\end{tabular}

TABLE III: Values of the coupling constants for our DNA model.

VI. SMALL AMPLITUDE EXCITATIONS AND DISPERSION RELATIONS

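In this section we will investigate the dynamical behavior of our model for small excitations in the linear regime. We will enforce the Yakushevich condition $R + d_h = a$ in order to keep the calculations and their results as simple as possible (see also [24]). Linearizing the equations of motion (IV.2) around the equilibrium configuration
\[ \varphi_i^{(a)} = \theta_i^{(a)} = \dot\varphi_i^{(a)} = \dot\theta_i^{(a)} = 0 , \tag{VI.1} \]
we get, using standard algebra,
\begin{align*}
& m r^2 \ddot\varphi_i^{(a)} + m (rR + r^2) \ddot\theta_i^{(a)} = \\
& \quad = K_s \left[ (rR + r^2)(\theta_{i+1}^{(a)} - 2\theta_i^{(a)} + \theta_{i-1}^{(a)}) + r^2 (\varphi_{i+1}^{(a)} - 2\varphi_i^{(a)} + \varphi_{i-1}^{(a)}) \right] \\
& \qquad - K_p \left[ (a - R)^2 (\varphi_i^{(a)} + \varphi_i^{(\bar a)}) + a (a - R)(\theta_i^{(a)} + \theta_i^{(\bar a)}) \right] ; \\[1ex]
& m (rR + r^2) \ddot\varphi_i^{(a)} + (I + m (R + r)^2) \ddot\theta_i^{(a)} = \\
& \quad = K_t \left( \theta_{i+1}^{(a)} - 2\theta_i^{(a)} + \theta_{i-1}^{(a)} \right) \\
& \qquad + K_s (r + R) \left[ (R + r)(\theta_{i+1}^{(a)} - 2\theta_i^{(a)} + \theta_{i-1}^{(a)}) + r (\varphi_{i+1}^{(a)} - 2\varphi_i^{(a)} + \varphi_{i-1}^{(a)}) \right] \\
& \qquad - K_p \left[ (a^2 - aR)(\varphi_i^{(a)} + \varphi_i^{(\bar a)}) + a^2 (\theta_i^{(a)} + \theta_i^{(\bar a)}) \right] \\
& \qquad + K_h \left( \theta_{i+5}^{(\bar a)} - 2\theta_i^{(a)} + \theta_{i-5}^{(\bar a)} \right) . \tag{VI.2}
\end{align*}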

We are mainly interested in the dispersion relations for the propagating waves, which are solutions of the system (VI.2). To derive them it is convenient to introduce variables $\varphi^{(\pm)}$ and $\theta^{(\pm)}$ defined as
\[ \varphi_i^{\pm} = \varphi_i^{(1)} \pm \varphi_i^{(2)} , \qquad \theta_i^{\pm} = \theta_i^{(1)} \pm \theta_i^{(2)} . \tag{VI.3} \]

Let us now Fourier transform our variables, i.e. set
\[ \varphi_n^{\pm}(t) = F^{\pm}_{k\omega} \exp[i(k\delta n + \omega t)] ; \qquad \theta_n^{\pm}(t) = G^{\pm}_{k\omega} \exp[i(k\delta n + \omega t)] . \tag{VI.4} \]

Here $k$ is the spatial wave number, $\omega$ is the wave frequency, and $\delta$ is a parameter with dimension of length, set equal to the interpair distance ($\delta = 3.4$ Å) and introduced so that $k$ has dimension $[L]^{-1}$ and the physical wavelength is $\lambda = 2\pi/k$. In this way, we should only consider $k \in [-\pi/\delta, \pi/\delta]$. Using (VI.3) and (VI.4) in (VI.2), we get a set of linear equations for $(F^{\pm}_{k\omega}, G^{\pm}_{k\omega})$; each set of coefficients with indices $(k, \omega)$ decouples from the other wave number and frequency coefficients, i.e. we have a set of four-dimensional systems depending on the two continuous parameters $k$ and $\omega$. This is better rewritten in vector notation as
\[ M \Psi_{k\omega} = 0 , \tag{VI.5} \]
where $\Psi_{k\omega}$ is the vector of components
\[ \Psi_{k\omega} = \left( F^{+}_{k\omega}, F^{-}_{k\omega}, G^{+}_{k\omega}, G^{-}_{k\omega} \right) \tag{VI.6} \]

and $M$ is a four by four matrix which we do not write explicitly. In order to simplify the calculations we will set to zero the radius of the disk modelling the base, i.e. $d_h = r$. As in our model the disk describing the base cannot rotate around its axis, this assumption does not modify the physical outcome of the calculations. The condition for the existence of a solution to (VI.5) is the vanishing of the determinant of $M$. By explicit computation the latter is written as the product of three terms, apart from a constant factor $r^2$,
\[ \|M\| = r^2 \Delta_1 \Delta_2 \Delta_3 , \tag{VI.7} \]
the three factors being
\begin{align*}
\Delta_1 &= -2K_s + m\omega^2 + 2K_s \cos(k\delta) , \\
\Delta_2 &= I\omega^2 - 2\left( (K_h + K_t) - K_t \cos[k\delta] + K_h \cos[5k\delta] \right) , \\
\Delta_3 &= I m r^2 \omega^4 - 2 r^2 \left[ K_p I + 2 (K_s I + K_t m) \sin^2(k\delta/2) + 2 m K_h \sin^2(5k\delta/2) \right] \omega^2 \\
& \quad + 8 r^2 \left[ K_p + 2K_s \sin^2(k\delta/2) \right] \left[ K_h \sin^2(5k\delta/2) + K_t \sin^2(k\delta/2) \right] . \tag{VI.8}
\end{align*}
The equation $\|M\| = 0$ has four solutions, given by
\begin{align*}
\omega_1^2 &= 4 (K_s/m) \sin^2(k\delta/2) , \\
\omega_2^2 &= 4 (K_t/I) \sin^2(k\delta/2) + 2 (K_h/I) [1 + \cos(5k\delta)] , \\
\omega_3^2 &= 2 (K_p/m) + 4 (K_s/m) \sin^2(k\delta/2) , \\
\omega_4^2 &= 4 (K_t/I) \sin^2(k\delta/2) + 4 (K_h/I) \sin^2(5k\delta/2) . \tag{VI.9}
\end{align*}

Eqs. (VI.9) provide the dispersion relations for our model. Physically, the four dispersion relations correspond to the four oscillation modes of the system in the linear regime. The relation involving $\omega_1$ describes relative oscillations of the two bases in the chain with respect to the neighboring bases. As $\omega_1(k) \to 0$ for $k \to 0$, there is no threshold for the generation of these phonon mode excitations. The relations involving $\omega_2$ and $\omega_4$ are associated with torsional oscillations of the backbone. In the case of $\omega_2$ there is a threshold for the generation of the excitation, originating in the helicoidal interaction, whereas the second torsional mode $\omega_4$ has no threshold and is thus also of acoustical type. The dispersion relation involving $\omega_3$ describes relative oscillations of the two bases in a pair. The threshold for the generation of the excitation is now determined by the pairing interaction. The dispersion relations (VI.9) are plotted in Fig. 3 as $\omega/(2\pi c)$, where $c$ is the speed of light (we use the convention, widespread in the literature, of measuring frequencies in units of $2\pi c$), versus the rescaled wave number, for the values of the physical parameters given in Tables I and III. The four dispersion relations take a simple form if we consider excitations with wavelength much bigger than the interpair distance, i.e. $\lambda \gg \delta$; this corresponds to the $k\delta \to 0$ limit. We have then

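\[ \omega_\alpha^2 - c_\alpha^2 k^2 = q_\alpha^2 , \tag{VI.10} \]
where $c_\alpha$ and $q_\alpha$ ($\alpha = 1 \ldots 4$) are, respectively, the velocity of propagation (in the limit $c_\alpha k \gg q_\alpha$) and the excitation threshold. They are given by
\begin{align*}
c_1 &= \delta \sqrt{K_s/m} , & q_1 &= 0 ; \\
c_2 &= \delta \sqrt{(K_t - 25 K_h)/I} , & q_2 &= 2 \sqrt{K_h/I} ; \\
c_3 &= \delta \sqrt{K_s/m} , & q_3 &= \sqrt{2K_p/m} ; \\
c_4 &= \delta \sqrt{(K_t + 25 K_h)/I} , & q_4 &= 0 . \tag{VI.11}
\end{align*}
Using the values of the parameters given in Tables I and III we have
\begin{align*}
c_1 &= 6.1\ \mathrm{km/s} , & q_1 &= 0 ; \\
c_2 &= 0\ \mathrm{km/s} , & q_2 &= 22\ \mathrm{cm}^{-1} ; \\
c_3 &= 6.1\ \mathrm{km/s} , & q_3 &= 36\ \mathrm{cm}^{-1} ; \\
c_4 &= 5.1\ \mathrm{km/s} , & q_4 &= 0 , \tag{VI.12}
\end{align*}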

where $c_2 = 0$ comes from the fact that we are taking $K_t \simeq 25 K_h$ (see Table III). This of course just means that $c_2$ is at least an order of magnitude smaller than the other $c_i$, and therefore negligible. Speeds can be converted to bases per second by dividing each $c_i$ by $\delta = 3.4$ Å; excitation thresholds can be converted to inverse seconds by multiplying each $q_i$ by $2\pi c$, where $c$ is the speed of light.
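The orders of magnitude above can be retraced with a short sketch. It rests on several assumptions that are not spelled out in the text: the long-wave speeds are taken as $\delta\sqrt{K/m}$ and $\delta\sqrt{K/I}$, $m$ is the mean base mass and $I$ the sugar moment of inertia of Table I, and $K_p = 4$ N/m is the value adopted for the numerics. All variable names are illustrative.

```python
import math

AMU = 1.66054e-27                 # kg
AVOGADRO = 6.02214e23             # 1/mol
LIGHT_SPEED_CM = 2.99792458e10    # cm/s

delta = 3.4e-10                   # base-pair spacing, m
m = 130 * AMU                     # mean base mass (Table I), assumed here
inertia = 1.2e2 * 1.67e-47        # sugar moment of inertia (Table I), kg m^2, assumed here
K_s = 68.0                        # stacking constant, N/m
K_p = 4.0                         # pairing constant, N/m
K_t = 130e3 / AVOGADRO            # torsional constant, J
K_h = 5e3 / AVOGADRO              # helicoidal constant, J

def to_wavenumber(omega):
    """Convert an angular frequency in rad/s to cm^-1 by dividing by 2 pi c."""
    return omega / (2.0 * math.pi * LIGHT_SPEED_CM)

# Long-wave speeds (m/s) and thresholds (cm^-1).
c1 = delta * math.sqrt(K_s / m)
c2 = delta * math.sqrt((K_t - 25.0 * K_h) / inertia)
c4 = delta * math.sqrt((K_t + 25.0 * K_h) / inertia)
q2 = to_wavenumber(2.0 * math.sqrt(K_h / inertia))
q3 = to_wavenumber(math.sqrt(2.0 * K_p / m))

print(f"c1 = {c1 / 1e3:.1f} km/s, c2 = {c2 / 1e3:.1f} km/s, c4 = {c4 / 1e3:.1f} km/s")
print(f"q2 = {q2:.0f} cm^-1, q3 = {q3:.0f} cm^-1")
```

Under these assumptions $c_1$, $c_4$ and $q_2$ land within a few percent of (VI.12), and $c_2$ is roughly an order of magnitude below the other speeds; $q_3$ comes out of the same order as the quoted value, with the residual difference depending on which mass and $K_p$ are used.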

FIG. 3: Graph of the dispersion relations (VI.9) in the first Brillouin zone. We plot $\omega/(2\pi c)$ ($c$ is the speed of light) as a function of the rescaled wave number. $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ are represented respectively by the thin continuous, thick continuous, thick dashed and thin dashed line. Units are cm$^{-1}$ on the vertical axis and radians on the horizontal axis.

VII. NONLINEAR DYNAMICS AND TRAVELLING WAVES

After studying the small amplitude dynamics of our model, we should now investigate the fully nonlinear dynamics. We are in particular interested in soliton solutions, and on physical grounds they should have (if the model has any relation with real DNA) a size of about twenty base pairs. This also means that such solutions vary smoothly on the length scale of the discrete chain, and we can pass to the continuum approximation. On the other hand, such a smooth variation assumption is not justified on the length scales (five base pairs) involved in the helicoidal interaction, and one should introduce nonlocal operators in order to take into account helicoidal interactions in the continuum approximation [26]. Luckily, numerical experiments show that, at least in the case of the original Yakushevich model, soliton solutions are very little affected by the presence or absence of the helicoidal terms (as could also be expected from their intrinsic smallness, in a context where they cannot play a qualitative role as they do for small amplitude dynamics) [26]. Thus, we will from now on simply drop the helicoidal terms, i.e. set $K_h = 0$.

A. Continuum approximation and field equations

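The continuum description of the discrete model we are considering requires introducing fields $\Phi^{(a)}(z, t)$, $\Theta^{(a)}(z, t)$ such that
\[ \Phi^{(a)}(n\delta, t) \approx \varphi_n^{(a)}(t) , \qquad \Theta^{(a)}(n\delta, t) \approx \theta_n^{(a)}(t) . \tag{VII.1} \]
The continuum approximation we wish to consider is the one where we take
\begin{align*}
\Phi(x \pm \delta, t) &\approx \Phi(x, t) \pm \delta\, \Phi_x(x, t) + (1/2)\, \delta^2 \Phi_{xx}(x, t) , \\
\Theta(x \pm \delta, t) &\approx \Theta(x, t) \pm \delta\, \Theta_x(x, t) + (1/2)\, \delta^2 \Theta_{xx}(x, t) . \tag{VII.2}
\end{align*}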
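As a worked step, and writing $\Theta$ for the continuum field associated with the backbone angles (a naming assumption of this sketch), the second-order expansion turns a discrete second difference into a second derivative:

```latex
\begin{align*}
\Theta(x+\delta,t) - 2\,\Theta(x,t) + \Theta(x-\delta,t)
  &\simeq \left[ \delta\,\Theta_x + \tfrac{1}{2}\delta^2\,\Theta_{xx} \right]
        + \left[ -\delta\,\Theta_x + \tfrac{1}{2}\delta^2\,\Theta_{xx} \right] \\
  &= \delta^2\,\Theta_{xx}(x,t) .
\end{align*}
```

In this way a linearized torsion term of the form $K_t(\theta_{n+1} - 2\theta_n + \theta_{n-1})$ becomes $\delta^2 K_t \Theta_{xx}$, which is the form in which the torsional coupling enters the field equations.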

Inserting (VII.2) into the Euler-Lagrange equations (IV.2), and taking $K_h = 0$, we obtain a set of nonlinear coupled PDEs for $\Phi^{(a)}(x, t)$ and $\Theta^{(a)}(x, t)$, depending on the parameter $\delta$. Coherently with (VII.2), we expand these equations to second order in $\delta$ and drop higher order terms. The equations we obtain in this way are symmetric under the chain exchange. We will be able to decompose their solutions into a symmetric and an antisymmetric part under the same exchange. In view of the considerable complication of the system, it is convenient to deal directly with the equations for these symmetric and antisymmetric parts and to enforce the Y contact approximation $R + d_h = a$. That is, we will consider fields
\[ \Phi^{\pm} = \Phi^{(1)} \pm \Phi^{(2)} , \qquad \Theta^{\pm} = \Theta^{(1)} \pm \Theta^{(2)} , \tag{VII.3} \]
and discuss the equations which result by setting two of these to zero. In the symmetric case, i.e. for
\[ \Phi^{(1)}(x, t) = \Phi^{(2)}(x, t) = \Phi(x, t) , \qquad \Theta^{(1)}(x, t) = \Theta^{(2)}(x, t) = \Theta(x, t) , \tag{VII.4} \]

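the resulting equations are
\begin{align*}
& m r^2 \Phi_{tt} + (m r^2 + m R r \cos\Phi)\, \Theta_{tt} = -2a (a - R) K_p \sin(\Phi + \Theta) \\
& \qquad - R \left( 2K_p (R - a) + m r \Theta_t^2 \right) \sin\Phi + \delta^2 K_s r \left[ r (\Phi_{xx} + \Theta_{xx}) + R\, \Theta_{xx} \cos\Phi + R\, \Theta_x^2 \sin\Phi \right] ; \\[1ex]
& (m r^2 + m R r \cos\Phi)\, \Phi_{tt} + (I + m (R^2 + r^2 + 2Rr \cos\Phi))\, \Theta_{tt} = \\
& \quad = -2a K_p \left( R \sin\Theta + (a - R) \sin(\Phi + \Theta) \right) + m \Phi_t R r (\Phi_t + 2\Theta_t) \sin\Phi \\
& \qquad + \delta^2 \left[ K_s r^2 \Phi_{xx} + (K_t + K_s (R^2 + r^2))\, \Theta_{xx} + K_s R r \left( (\Phi_{xx} + 2\Theta_{xx}) \cos\Phi - \Phi_x (\Phi_x + 2\Theta_x) \sin\Phi \right) \right] . \tag{VII.5}
\end{align*}
In the antisymmetric case
\[ \Phi^{(1)}(x, t) = -\Phi^{(2)}(x, t) = \Phi(x, t) , \qquad \Theta^{(1)}(x, t) = -\Theta^{(2)}(x, t) = \Theta(x, t) , \tag{VII.6} \]
the resulting equations are
\begin{align*}
& m r^2 \Phi_{tt} + (m r^2 + m R r \cos\Phi)\, \Theta_{tt} = \\
& \quad = K_p (a - R) \left( R \sin(\Phi + 2\Theta) + (a - R) \sin(2(\Phi + \Theta)) - 2a \sin(\Phi + \Theta) \right) \\
& \qquad + R \left( K_p (a - R) - m r \Theta_t^2 \right) \sin\Phi + \delta^2 K_s r \left[ r (\Phi_{xx} + \Theta_{xx}) + R \cos(\Phi)\, \Theta_{xx} + R \sin(\Phi)\, \Theta_x^2 \right] ; \\[1ex]
& (m r^2 + m R r \cos\Phi)\, \Phi_{tt} + (I + m (R^2 + r^2 + 2Rr \cos\Phi))\, \Theta_{tt} = \\
& \quad = K_p \left[ -2aR \sin\Theta + R^2 \sin(2\Theta) + (a - R) \left( -2a \sin(\Phi + \Theta) + (a - R) \sin(2(\Phi + \Theta)) + 2R \sin(\Phi + 2\Theta) \right) \right] \\
& \qquad + m \Phi_t r R (\Phi_t + 2\Theta_t) \sin\Phi \\
& \qquad + \delta^2 \left[ K_s r^2 \Phi_{xx} + K_t \Theta_{xx} + K_s (R^2 + r^2)\, \Theta_{xx} + K_s r R \left( (\Phi_{xx} + 2\Theta_{xx}) \cos\Phi - \Phi_x (\Phi_x + 2\Theta_x) \sin\Phi \right) \right] . \tag{VII.7}
\end{align*}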

We will not write the equations in the cases of mixed symmetry, i.e. for $\Phi^{(1)}(x, t) = \Phi^{(2)}(x, t) = \Phi(x, t)$, $\Theta^{(1)}(x, t) = -\Theta^{(2)}(x, t) = \Theta(x, t)$ and for $\Phi^{(1)}(x, t) = -\Phi^{(2)}(x, t) = \Phi(x, t)$, $\Theta^{(1)}(x, t) = \Theta^{(2)}(x, t) = \Theta(x, t)$.

B. Soliton equations

When studying DNA models, one is especially interested in travelling wave solutions, i.e. solutions depending only on $z := x - vt$ with fixed speed $v$:
\[ \Phi^{(a)}(x, t) = \phi^{(a)}(x - vt) := \phi^{(a)}(z) , \qquad \Theta^{(a)}(x, t) = \vartheta^{(a)}(x - vt) := \vartheta^{(a)}(z) . \tag{VII.8} \]

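If we insert the ansatz (VII.8) into the equations (VII.5) and (VII.7), we get a set of four coupled second order ODEs; defining
\[ \mu := (m v^2 - K_s \delta^2) , \qquad J := (I v^2 - K_t \delta^2) , \tag{VII.9} \]
in the completely symmetric case (VII.5) we obtain
\begin{align*}
& \mu r^2 \phi'' + \mu r (r + R \cos\phi)\, \vartheta'' = \\
& \quad = -2a K_p (a - R) \sin(\phi + \vartheta) + K_s \delta^2 R r \sin(\phi)\, (\vartheta')^2 - R \sin(\phi) \left( -2K_p (a - R) + m r v^2 (\vartheta')^2 \right) ; \\[1ex]
& \mu r (r + R \cos\phi)\, \phi'' + \left[ J + \mu (R^2 + r^2 + 2Rr \cos\phi) \right] \vartheta'' = \\
& \quad = -2a K_p \left( R \sin\vartheta + (a - R) \sin(\phi + \vartheta) \right) + \mu R r \sin(\phi) \left[ (\phi')^2 + 2\phi' \vartheta' \right] . \tag{VII.10}
\end{align*}
In the completely antisymmetric case (VII.7), instead, we get
\begin{align*}
& \mu r^2 \phi'' + \mu r (r + R \cos\phi)\, \vartheta'' = \\
& \quad = -2K_p (a - R) \left( a - R \cos\vartheta - (a - R) \cos(\phi + \vartheta) \right) \sin(\phi + \vartheta) - \mu R r (\sin\phi)\, (\vartheta')^2 ; \\[1ex]
& \mu r (r + R \cos\phi)\, \phi'' + \left[ J + \mu (R^2 + r^2 + 2Rr \cos\phi) \right] \vartheta'' = \\
& \quad = -K_p \left[ 2aR \sin\vartheta - R^2 \sin(2\vartheta) + (a - R) \left( 2a \sin(\phi + \vartheta) - (a - R) \sin(2(\phi + \vartheta)) - 2R \sin(\phi + 2\vartheta) \right) \right] \\
& \qquad + \mu r R (\sin\phi) \left[ (\phi')^2 + 2\phi' \vartheta' \right] . \tag{VII.11}
\end{align*}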

The previous equations appear too involved to be studied analytically at least in the general case. Numerical results are discussed in Sect. IX below. Some understanding can be gained at the analytical level by considering a particular case of the full equations (VII.10),(VII.11), when the system reduces essentially to the Y case. The next section is devoted to this.

C. Boundary conditions

We have so far just discussed the field equations (VII.5) and (VII.7) and their reductions; however these PDEs make sense only once we specify the function space to which their solutions are required to belong. The natural physical condition is that of finite energy; we now briefly discuss what it means in terms of our equations and the boundary conditions it imposes on their solutions. The field equations (VII.5), (VII.7) are Euler-Lagrange equations for the Lagrangian obtained as the continuum limit of (III.1). In the present case, the finite energy condition corresponds to requiring that for large $|z|$ the kinetic energy vanishes and the configuration corresponds to points of minimum for the potential energy. The condition on the kinetic energy yields
\[ \Phi_t(\pm\infty, t) = 0 , \qquad \Theta_t(\pm\infty, t) = 0 , \tag{VII.12} \]
where of course $\Phi_t(\pm\infty, t)$ stands for $\lim_{z \to \pm\infty} \Phi_t(z, t)$, and similarly for $\Theta_t$. As for the condition involving potentials, by the explicit expression of our potentials, see above, this means (with the same shorthand notation as above)
\[ \Phi(\pm\infty, t) = 0 , \qquad \Theta(\pm\infty, t) = 2\pi n_{\pm} , \qquad \Phi_z(\pm\infty, t) = 0 , \qquad \Theta_z(\pm\infty, t) = 0 . \tag{VII.13} \]
Let us now consider the reduction to travelling waves, i.e. eqs. (VII.10) and (VII.11). In this framework, conditions (VII.12) and (VII.13) imply we have to require the limit behavior described by
\[ \phi(\pm\infty) = 0 , \qquad \vartheta(\pm\infty) = 2\pi n_{\pm} , \qquad \phi'(\pm\infty) = 0 , \qquad \vartheta'(\pm\infty) = 0 , \tag{VII.14} \]
for the functions $\phi(z)$, $\vartheta(z)$. We would like to stress that eqs. (VII.10) and (VII.11) can also be seen as describing the motion (in the fictitious time $z$) of point masses of coordinates $\phi(z)$, $\vartheta(z)$ in an effective potential; such a motion can satisfy the boundary conditions (VII.14) only if $(\phi, \vartheta) = (0, 2\pi k)$ is a point of maximum for the effective potential. This would provide the condition $\mu < 0$ and hence a maximal speed for soliton propagation (as also happens for the standard Y model); we will not discuss this point here, as it does not differ from the standard Y case, and the condition $\mu < 0$ is satisfied with the values of parameters obtained and discussed in Sect. V. The solutions satisfying (VII.14) can be classified by the winding number $n := n_+ - n_-$. Needless to say, here we considered the equations describing symmetric or antisymmetric solutions, but a similar discussion also applies to the full equations, i.e. those in which we have not selected any symmetry of the solutions; in this case we would have two winding numbers, which we can associate either to $\vartheta^{(1)}$, $\vartheta^{(2)}$ or directly to their symmetric and antisymmetric combinations $\vartheta^{(1)} \pm \vartheta^{(2)}$.


VIII. COMPARISON WITH THE YAKUSHEVICH MODEL

The standard Y model [52] can be seen as a particular limiting case of our model. Thus, a check of the validity of our model (and in particular of the fact that we considered here physical assumptions which correspond to those made by Y in her geometry) can be obtained by going to a limit in which our composite Y model reduces to the standard Y model. The latter can be obtained as a limiting case of our composite model in two conceptually different ways.
• A first possibility, which we call parametric, is to choose the geometrical parameters of the model so that its geometry actually reduces to that of the standard Y model.
• A second possibility, which we call dynamical, is to force the dynamics of our model by setting $\varphi^{(a)} = 0$, i.e. by freezing the non-topological angles and constraining them to be zero.
Let us briefly discuss these in some more detail. The parametric way to recover the standard Y model from our model consists in setting to zero the radius of the disks modelling the bases, and at the same time pushing each of them onto the disk representing the nucleoside on the DNA chain. In this way the base corresponds to a point on the circle bounding the disk representing the nucleoside. Note that this would cause a change in the interbase equilibrium distance, unless we at the same time also change the radius of the disk representing the nucleoside. This limiting procedure corresponds, recalling that we also want to recover the Y approximation of zero interbase distance, to the following choice of the parameters:
\[ m = K_s = d_h = r = 0 , \qquad a = R . \tag{VIII.1} \]

After the base has been pushed onto the disk, its mass becomes part of the disk's mass (hence contributes to its moment of inertia) and we can thus just take $m = 0$. Similarly, as the bases have lost their identity and are enclosed in the disk modelling the whole nucleotide, the effective stacking interaction has to be physically identified with the torsional interaction of the disk now modelling the entire nucleotide. For this reason in our equations we will take $K_s = 0$ and $K_t \to K_s$. Use of equations (VIII.1) in the equations of motion (IV.2) yields
\begin{align*}
I \ddot\theta_i^{(a)} =\; & K_s \sin(\theta_{i-1}^{(a)} - \theta_i^{(a)}) + K_s \sin(\theta_{i+1}^{(a)} - \theta_i^{(a)}) - 2 K_p R^2 \sin\theta_i^{(a)} \\
& + K_p R^2 \sin(\theta_i^{(a)} - \theta_i^{(\bar a)}) + K_h \left( \theta_{i+5}^{(\bar a)} - 2\theta_i^{(a)} + \theta_{i-5}^{(\bar a)} \right) , \tag{VIII.2}
\end{align*}
where the term $-2K_pR^2\sin\theta_i^{(a)}$ is the $-2aK_pR\sin\theta_i^{(a)}$ term of (IV.2) evaluated at $a = R$.

The previous equations represent the equations of motion for the Y model. Some care has to be used when the values of the parameters given by Eq. (VIII.1) correspond to singular points of the equations. This is for instance the case of the dispersion relations $\omega_1$, $\omega_3$ in Eqs. (VI.9), which are singular for $m = 0$. The dispersion relations for the Y model can be easily found by linearizing the system (VIII.2). One finds two dispersion relations; one is given by $\omega_2$ of the composite model, see Eqs. (VI.9); the other is
\[ \omega^2 = \frac{2R^2 K_p}{I} + \frac{4K_s}{I} \sin^2\left( \frac{k\delta}{2} \right) + \frac{4K_h}{I} \sin^2\left( \frac{5k\delta}{2} \right) . \tag{VIII.3} \]

To obtain the Y model dynamically from our model, we set $\Phi = 0$ in the continuum equations (VII.5) and (VII.7). We also enforce the Y condition $R + d_h = a$ and work in the zero radius approximation for the disk modelling the base, i.e. we set $r = d_h$. In the fully symmetric case we get from Eqs. (VII.5)
\begin{align*}
m \Theta_{tt} &= -2K_p \sin(\Theta) + \delta^2 K_s \Theta_{xx} ; \\
\left( \frac{I}{(R + r)^2} + m \right) \Theta_{tt} &= -2K_p \sin\Theta + \delta^2 \left( \frac{K_t}{(r + R)^2} + K_s \right) \Theta_{xx} . \tag{VIII.4}
\end{align*}

Compatibility of the previous two equations requires that
\[ I \Theta_{tt} = \delta^2 K_t \Theta_{xx} . \tag{VIII.5} \]
In the case of travelling wave solutions $\Theta(x, t) = \vartheta(x - vt)$ the constraint (VIII.5) reads
\[ v^2 = \frac{\delta^2 K_t}{I} . \tag{VIII.6} \]

We take from now on the positive determination of the velocity for ease of discussion. Using Eqs. (VII.8), (VII.9) and (VIII.6), Eqs. (VIII.4) yield the travelling wave equation
\[ \vartheta'' = -2 (K_p/\mu_0) \sin\vartheta , \tag{VIII.7} \]
where
\[ \mu_0 = \frac{(m K_t - I K_s)\, \delta^2}{I} . \tag{VIII.8} \]

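With the usual boundary conditions $\vartheta'(\pm\infty) = 0$, $\vartheta(\infty) = 2\pi$, $\vartheta(-\infty) = 0$, Eq. (VIII.7) has a solution for $\mu_0 < 0$, given precisely by the (1, 0) Yakushevich soliton
\[ \vartheta = 4 \arctan\left[ e^{\kappa z} \right] , \qquad \kappa = \sqrt{2K_p/|\mu_0|} . \tag{VIII.9} \]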
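That this profile solves the pendulum-type equation can be checked numerically with a minimal sketch. It assumes $\mu_0 < 0$, so that (VIII.7) reads $\vartheta'' = \kappa^2 \sin\vartheta$ with $\kappa^2 = 2K_p/|\mu_0|$, and works in dimensionless units with $\kappa = 1$; the function names are illustrative.

```python
import math

def kink(z, kappa):
    """Kink profile 4 arctan(exp(kappa z)), interpolating between 0 and 2 pi."""
    return 4.0 * math.atan(math.exp(kappa * z))

def second_derivative(f, z, h=1.0e-3):
    """Central finite-difference approximation of f''(z)."""
    return (f(z + h) - 2.0 * f(z) + f(z - h)) / (h * h)

kappa = 1.0   # dimensionless choice for this sketch
residuals = []
for z in [-3.0, -1.0, 0.0, 0.5, 2.0]:
    lhs = second_derivative(lambda s: kink(s, kappa), z)
    rhs = kappa ** 2 * math.sin(kink(z, kappa))
    residuals.append(abs(lhs - rhs))

left_limit = kink(-20.0, kappa)    # should be close to 0
right_limit = kink(20.0, kappa)    # should be close to 2 pi
print(max(residuals), left_limit, right_limit)
```

The residual is limited only by the finite-difference step, and the two limits reproduce the boundary values 0 and $2\pi$ required of a soliton with winding number one.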

We have thus recovered for the topological angles (imposing the vanishing of the non-topological angles as an external constraint) the Y solitons. The condition for the existence of the soliton, $\mu_0 < 0$, implies that the physical parameters of our model must satisfy the condition
\[ \frac{K_t}{I} < \frac{K_s}{m} . \tag{VIII.10} \]
With the parameter values given in Tables I and III, we have
\[ \frac{m K_t}{I K_s} \simeq 0.3 < 1 ; \tag{VIII.11} \]

hence (VIII.10) is satisfied and we are in the region of existence of the soliton. Let us now consider the antisymmetric equations. Using $\phi = 0$ in the system (VII.11) and the compatibility equation (VIII.6) we get
\[ \vartheta'' = -2 (K_p/\mu_0) (1 - \cos\vartheta) \sin\vartheta . \tag{VIII.12} \]

As expected this equation, with the usual boundary conditions (see above), admits a solution for $\mu_0 < 0$, and the solution is in this case given by the (0, 1) Yakushevich soliton
\[ \vartheta = -2\, \mathrm{arccot}[\kappa z] \tag{VIII.13} \]
(with $\kappa$ as above). The allowed values of the physical parameters are determined by the same arguments used for the (1, 0) soliton. It should be stressed that in the standard Y model the travelling wave speed is essentially a free parameter, provided the speed is lower than a limiting speed [22, 23]. Here, recovering the standard Y model as a limiting case of the composite model produces a selection of the soliton speed, given by Eq. (VIII.6); this makes good physical sense, as it coincides with the speed of long waves determined by the dispersion relations (VI.11).
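The existence condition and the selected speed can be evaluated with a short sketch. It assumes, as the text does not spell out, that $m$ is the mean base mass and $I$ the sugar moment of inertia of Table I, with $K_t$ and $K_s$ taken from Table III; the variable names are illustrative.

```python
import math

AMU = 1.66054e-27            # kg
AVOGADRO = 6.02214e23        # 1/mol

delta = 3.4e-10              # base-pair spacing, m
m = 130 * AMU                # mean base mass (Table I), assumed here
inertia = 1.2e2 * 1.67e-47   # sugar moment of inertia (Table I), kg m^2, assumed here
K_s = 68.0                   # stacking constant, N/m
K_t = 130e3 / AVOGADRO       # torsional constant, J

# Dimensionless ratio entering the existence condition K_t / I < K_s / m.
ratio = m * K_t / (inertia * K_s)

# mu_0 = (m K_t - I K_s) delta^2 / I; a negative value means the soliton exists.
mu_0 = (m * K_t - inertia * K_s) * delta ** 2 / inertia

# Speed selected by the compatibility condition v^2 = delta^2 K_t / I.
v = delta * math.sqrt(K_t / inertia)

print(f"ratio = {ratio:.2f}, mu_0 < 0: {mu_0 < 0}, v = {v / 1e3:.1f} km/s")
```

With these inputs the ratio comes out close to the value 0.3 quoted above, and the selected speed is a few km/s, below the stacking phonon speed.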

IX. NUMERICAL ANALYSIS OF SOLITON EQUATIONS AND SOLITON SOLUTIONS

Even after the several simplifying assumptions we made for our DNA model, the complete equations of motion given by Eqs. (VII.5) and (VII.7), respectively for symmetric and antisymmetric configurations, are too complex to be solved analytically; the same applies to the reduced equations (VII.10), (VII.11) describing soliton solutions. We will thus look for solutions, and in particular for the soliton solutions we are interested in, numerically. In order to determine the profile of the soliton solutions we will analyze the stationary case, with zero speed and kinetic energy, and apply the conjugate gradients algorithm (see e.g. [28, 35]) to evaluate numerically the minima of our Hamiltonian. This approach also allows a direct comparison with the results obtained for the standard Yakushevich model and shown in [57], where the authors proceed in the same way and by means of the same numerical algorithm; this again with the aim, as remarked several times above, to emphasize the differences which are due purely to the different geometry of our composite model. With the same motivation, we have also checked our numerical routines by applying them to the standard Y model; in doing this we have also considered with some care, and fully confirmed, certain nontrivial effects mentioned in [57].
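The kind of procedure described here can be illustrated with a minimal sketch. It is not the code used for the results of this paper: the energy is a simplified single-angle chain (harmonic coupling plus an on-site cosine term) standing in for the full Hamiltonian, the minimizer is a hand-written Polak-Ribiere conjugate gradient with a backtracking line search rather than the library routines of [28, 35], and the constants `g`, `k` and the chain length are arbitrary illustrative choices.

```python
import math

def energy(theta, g, k):
    """Static energy: harmonic nearest-neighbour coupling plus on-site cosine term."""
    e = sum(0.5 * g * (theta[i + 1] - theta[i]) ** 2 for i in range(len(theta) - 1))
    return e + sum(k * (1.0 - math.cos(t)) for t in theta)

def gradient(theta, g, k):
    """Gradient on interior sites; the two end sites are kept fixed (zero gradient)."""
    grad = [0.0] * len(theta)
    for i in range(1, len(theta) - 1):
        grad[i] = g * (2.0 * theta[i] - theta[i - 1] - theta[i + 1]) + k * math.sin(theta[i])
    return grad

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def conjugate_gradient(theta, g, k, iterations=200, tol=1.0e-8):
    """Polak-Ribiere nonlinear conjugate gradient with Armijo backtracking."""
    theta = list(theta)
    grad = gradient(theta, g, k)
    direction = [-x for x in grad]
    e = energy(theta, g, k)
    for _ in range(iterations):
        gnorm2 = dot(grad, grad)
        if gnorm2 < tol ** 2:
            break
        slope = dot(grad, direction)
        if slope >= 0.0:                      # not a descent direction: restart
            direction = [-x for x in grad]
            slope = -gnorm2
        step = 1.0
        while True:                           # halve the step until the energy decreases enough
            trial = [t + step * d for t, d in zip(theta, direction)]
            e_trial = energy(trial, g, k)
            if e_trial <= e + 1.0e-4 * step * slope or step < 1.0e-12:
                break
            step *= 0.5
        if e_trial > e:                       # line search could not improve: stop
            break
        new_grad = gradient(trial, g, k)
        diff = [a - b for a, b in zip(new_grad, grad)]
        beta = max(0.0, dot(new_grad, diff) / gnorm2)
        direction = [-a + beta * d for a, d in zip(new_grad, direction)]
        theta, grad, e = trial, new_grad, e_trial
    return theta, e

# Hyperbolic tangent starting profile interpolating between 0 and 2 pi.
n = 61
start = [math.pi * (1.0 + math.tanh(0.5 * (i - n // 2))) for i in range(n)]
start[0], start[-1] = 0.0, 2.0 * math.pi
e_start = energy(start, 4.0, 1.0)
profile, e_final = conjugate_gradient(start, g=4.0, k=1.0)
print(f"energy: {e_start:.4f} -> {e_final:.4f}")
```

Because the end sites are pinned at 0 and $2\pi$, the minimizer cannot unwind the configuration, so it relaxes the starting profile towards a static kink with the same winding number instead of the trivial minimum.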


A. Solitons in the standard Yakushevich model

As mentioned above, we will first present the numerical investigation of the stationary solitons of the Yakushevich model. Although the soliton solutions of the Yakushevich model have already been the subject of previous numerical investigations [57], it is useful to repeat the analysis here in order to check our algorithm (and confirm the results reported in [57]). Moreover, in the next section we will compare the soliton solutions of our composite model with those of the Yakushevich model; we will therefore need explicit numerical results for the Yakushevich model obtained using our code. The homogeneous Yakushevich Hamiltonian for static solutions, i.e. setting the kinetic term T to zero, is

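$$
\begin{aligned}
H_Y &= \frac{I}{2} \sum_{i,a} \left( \theta^{(a)}_{i+1} - \theta^{(a)}_{i} \right)^2 + g \sum_{i,a} \sin^2 \left( \left( \theta^{(a)}_{i+1} - \theta^{(a)}_{i} \right)/2 \right) + K \sum_{i} \left[ 3 + \cos\left(\theta^{(1)}_i - \theta^{(2)}_i\right) - 2 \cos\theta^{(1)}_i - 2 \cos\theta^{(2)}_i \right] \\
&= I \sum_{n} \left[ (\Delta\psi_n)^2 + (\Delta\chi_n)^2 \right] + g \sum_{n} \left[ 1 - \cos\Delta\psi_n \cos\Delta\chi_n \right] + 2K \sum_{n} \left[ 1 - 2 \cos\psi_n \cos\chi_n + \cos^2\chi_n \right] .
\end{aligned}
\tag{IX.1}
$$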
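The equality between the two lines of Eq. (IX.1) is a matter of elementary trigonometric identities, and can be checked numerically on a random configuration. The sketch below assumes that the two single-chain angles are the sum and the difference of the two collective coordinates; the function names, the symbol names `theta`, `psi`, `chi` and the parameter values are illustrative choices.

```python
import math
import random

def energy_single_chain_angles(th1, th2, I_coef, g, K):
    """Static energy written in the two single-chain angles th1[n], th2[n]."""
    e = 0.0
    for th in (th1, th2):
        for n in range(len(th) - 1):
            d = th[n + 1] - th[n]
            e += 0.5 * I_coef * d * d + g * math.sin(0.5 * d) ** 2
    for a, b in zip(th1, th2):
        e += K * (3.0 + math.cos(a - b) - 2.0 * math.cos(a) - 2.0 * math.cos(b))
    return e

def energy_sum_difference_angles(psi, chi, I_coef, g, K):
    """Same energy in the coordinates psi, chi with th1 = psi + chi, th2 = psi - chi."""
    e = 0.0
    for n in range(len(psi) - 1):
        dp = psi[n + 1] - psi[n]
        dc = chi[n + 1] - chi[n]
        e += I_coef * (dp * dp + dc * dc) + g * (1.0 - math.cos(dp) * math.cos(dc))
    for p, c in zip(psi, chi):
        e += 2.0 * K * (1.0 - 2.0 * math.cos(p) * math.cos(c) + math.cos(c) ** 2)
    return e

# Random test configuration; the parameter values are arbitrary.
random.seed(0)
psi = [random.uniform(-3.0, 3.0) for _ in range(50)]
chi = [random.uniform(-3.0, 3.0) for _ in range(50)]
th1 = [p + c for p, c in zip(psi, chi)]
th2 = [p - c for p, c in zip(psi, chi)]
e_theta = energy_single_chain_angles(th1, th2, I_coef=0.3, g=23.4, K=1.0)
e_psichi = energy_sum_difference_angles(psi, chi, I_coef=0.3, g=23.4, K=1.0)
print(e_theta, e_psichi)
```

The two numbers agree to rounding error for any configuration, so either set of coordinates can be used in the minimization; only the conditioning of the numerical problem changes.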

Here $I$ is the moment of inertia; $K$ and $g$ are respectively the pairing and torsional coupling constants of the Yakushevich model; $\theta^{(1,2)}$ are the angles describing the sugar rotation with respect to the backbone. Moreover, we write $\theta^{(1)} = \psi + \chi$, $\theta^{(2)} = \psi - \chi$ (the $\psi$, $\chi$ are defined as in Eq. (VI.3)), and $\Delta\psi_n := \psi_{n+1} - \psi_n$ and similarly for $\chi$. If we select the physical value for $I$ (given in table I) and factor out $K$ (equivalently, we measure energies in units of $K$; this takes the value $K = 150$ kJ/mol), then the only independent parameter left in $H$ is the coupling constant $g$. This can and will be used, as in [57], for a parametric study of the solutions.

In our numerical investigations we used (in order to avoid any accidental spurious effect) two independent implementations of the conjugate-gradient algorithm, i.e. the one provided by Numerical Recipes [35] and the one provided by the GNU Scientific Library [28]. The results obtained with the two implementations turned out to be in very good agreement. The conjugate-gradient algorithm requires a starting point, i.e. an initial approximation of the minimizing configuration; in the case of multiple minima, the algorithm will actually determine one local minimum or another depending on this initial approximation. Following [57], we have used as starting points hyperbolic tangent profiles

1,2 n = q 1,2 (1 + tanh((2n ? N ))) ,

(IX.2)

where $(q^1, q^2)$ is the topological type of the soliton, $N$ is the number of sites in the chain, and the remaining quantity in the argument of the hyperbolic tangent is a parameter, with a reasonable range of about 0 to $N/10$, used to adjust the profile. This parameter is crucial in determining the structure of the minima (hence of the solitonic solutions) of the Yakushevich Hamiltonian. By choosing different ranges of variation for it we probe different dynamical regions of the system, where different local minima of the Hamiltonian may be present. In the case of the elementary solitons, the (1, 0) and the (0, 1) ones, for which only one degree of freedom matters, we obtain the same local minimum, up to $10^{-5}$ in the energy and $10^{-2}$ in the angles, independently of this parameter (provided of course that it is not too close to zero; in our case it suffices to keep it no smaller than 4), in agreement with the fact that in this case the minimum is known to be global.
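The independence of the result from the starting profile can be illustrated on a reduced problem. The sketch below is not the computation described above: it assumes that both chains carry the same angle (so that a single variable per site is left), keeps only the $g$ and $K$ terms of the static Hamiltonian, uses a hypothetical tanh profile of adjustable width as starting point, and replaces conjugate gradients by plain gradient descent for brevity. All function names are illustrative.

```python
import math

def energy(psi, g, K):
    """Reduced static energy when both chains carry the same angle psi[n]."""
    torsion = sum(g * (1.0 - math.cos(psi[n + 1] - psi[n]))
                  for n in range(len(psi) - 1))
    pairing = sum(4.0 * K * (1.0 - math.cos(p)) for p in psi)
    return torsion + pairing

def tanh_start(N, width):
    """Hypothetical starting profile: a tanh step from 0 to 2*pi centred at N/2."""
    psi = [math.pi * (1.0 + math.tanh((2 * n - N) / (2.0 * width)))
           for n in range(N)]
    psi[0], psi[-1] = 0.0, 2.0 * math.pi   # pin the boundary values exactly
    return psi

def relax(psi, g, K, steps=3000):
    """Plain gradient descent with fixed ends (a stand-in for conjugate gradients)."""
    psi = list(psi)
    step = 1.0 / (4.0 * g + 4.0 * K)       # inverse of a bound on the Hessian norm
    for _ in range(steps):
        grad = [g * (math.sin(psi[n] - psi[n - 1]) - math.sin(psi[n + 1] - psi[n]))
                + 4.0 * K * math.sin(psi[n])
                for n in range(1, len(psi) - 1)]
        for n, gn in enumerate(grad, start=1):
            psi[n] -= step * gn
    return psi

g, K, N = 150.0, 1.0, 100                   # energies in units of K
start_narrow, start_wide = tanh_start(N, 4.0), tanh_start(N, 10.0)
e_narrow = energy(relax(start_narrow, g, K), g, K)
e_wide = energy(relax(start_wide, g, K), g, K)
print(e_narrow, e_wide)
```

With a step size bounded by the inverse Hessian norm the energy can only decrease, and in this one-variable reduction the two starting widths relax to the same energy, which mirrors the behavior reported for the elementary solitons.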

1. Quasi-degeneration of the energy minimizing configurations for higher topological numbers

The situation appears to be different in the case of the (1, 1) soliton. Here in fact numerical investigations show a strong dependence of the local minimum determined by the algorithm on the profile parameter of the starting point for most of its range, in particular for values larger than 6, while energies vary very little (within 0.2%) around 49.3 K. This suggests we are in the presence of a rather complex structure of the phase space for the (1,1) limiting conditions. There still is however a small interval, between 4 and 6, where the algorithm behaves exactly as in the (1, 0) and (0, 1) cases, i.e. yields almost the same energy minimizing configuration as the parameter is varied, and allows us to find what we believe is the global minimum of the Hamiltonian. We have detected this same behavior also in solitons of higher topological types, e.g. (2, 0) and (2, 1). This suggests we are in the presence of a generic pattern [71]; we point out that the reason why this same behavior is not shared by the solitons of types (1, 0) and (0, 1) is related to the fact that in these two cases (and only in them) the problem actually reduces to a one-dimensional dynamics, while all other cases are intrinsically two-dimensional.

2. Numerical instability of the solitons

A noteworthy fact pointed out in [57] is that the discrete version of the soliton solutions loses stability, namely switches from minima to saddle points, as g gets close to 0, a phenomenon that is not shared by its continuous counterpart. We


FIG. 4: Static soliton profiles for the Yakushevich homogeneous Hamiltonian. Thicker lines correspond to g = 150, thinner ones to g = 23.4; dashed ones to the angle $\theta^{(1)}$ and continuous ones to the angle $\theta^{(2)}$. Topological numbers are as follows. Upper left (a): (1,0); lower left (b): (0,1); upper right (c): (1,1); lower right (d): (1,1). Picture (d) is obtained by a computation where $2\pi$ has been approximated with 6.28, and shows a sensitive dependence of the solution on the boundary conditions. The energies $E^{g}_{(p,q)}$ of the solitons are: $E^{150}_{(1,1)} = 153.4\,K$, $E^{150}_{(0,1)} = 62.93\,K$, $E^{150}_{(1,0)} = 62.93\,K$, $E^{23.4}_{(1,1)} = 59.32\,K$, $E^{23.4}_{(0,1)} = 24.63\,K$, $E^{23.4}_{(1,0)} = 24.63\,K$. Energies are measured in units of K = 150 kJ/mol.

FIG. 5: Static soliton profiles for the Yakushevich homogeneous Hamiltonian expressed in the coordinates $(\psi, \chi)$. Thicker lines correspond to g = 150, thinner ones to g = 23.4; dashed lines correspond to one of the two angles and continuous lines to the other. Topological numbers are: (1,0) in (a), (0,1) in (b), (1,1) in (c). The energies $E^{g}_{(p,q)}$ of the solitons are: $E^{150}_{(1,1)} = 62.93\,K$, $E^{150}_{(0,1)} = 153.4\,K$, $E^{150}_{(1,0)} = 195.3\,K$, $E^{23.4}_{(1,1)} = 24.64\,K$, $E^{23.4}_{(0,1)} = 59.34\,K$, $E^{23.4}_{(1,0)} = 75.56\,K$. Energies are measured in units of K = 150 kJ/mol.

have looked for this effect and confirmed the findings of [57]. In our numerical computations the g values at which the transition takes place turned out to be different depending on whether computations are performed in the $(\theta^{(1)}, \theta^{(2)})$ coordinates or in the $(\psi, \chi)$ ones. Transition values corresponding to the onset of the numerical instability are given in Table IV. The transition values are, in the $(\theta^{(1)}, \theta^{(2)})$ coordinates, g = 14.7 for the (1, 1) soliton and g = 7.05 for both the (1, 0) and (0, 1) ones; in the $(\psi, \chi)$ coordinates they are g = 7 for the (1, 1) soliton, g = 16.2 for the (0, 1) one and g = 14.7 for the (1, 0) one. When the instability sets in, what we observe is that the soliton, i.e. the discrete configuration


TABLE IV: The transition values $g_0$ for instability (arising for $g < g_0$) of the (p, q) solitons.

coordinates                         (1,0)    (0,1)    (1,1)
$(\theta^{(1)}, \theta^{(2)})$      7.05     7.05     14.7
$(\psi, \chi)$                      14.7     16.2     7.0

smoothly interpolating between the boundary values, breaks down and we have instead a configuration in which all angles are equal to the left boundary value for $n \le n_0$, and to the right boundary value for $n > n_0$. In other words, the transition between the two boundary values does not take place over a (more or less extended) range of sites, but abruptly at a given site, which we denoted above as $n_0$ but which can of course be any site. This also shows that in this case we have a strong degeneration of the Hamiltonian also in the finite length case (for infinite chains, this is always the case as the Hamiltonian is invariant under translations), which will show up as a computational instability in the numerical energy minimization.

In Fig. 4 we show the profiles of the Yakushevich solitons we have obtained for g = 23.4 (this is the value corresponding to the coupling constants in use in our model, see Eq. (IX.8) below) and g = 150 (this corresponds to the coupling constant value chosen in [57]). The profiles we obtain are qualitatively identical to the inhomogeneous ones presented in [57]. Incidentally, we noticed a peculiarly strong dependence on the initial conditions for the (1, 1) mode, so that those profiles change considerably depending on the approximation chosen for $2\pi$: e.g., in Fig. 4c we used an extremely precise approximation while in Fig. 4d we used $2\pi = 6.28$. We believe that the first one represents the correct solution, also because it is the only one of the two that respects the symmetry of the equations. We also produced profiles for the solitons of the Yakushevich Hamiltonian with respect to the angles $\psi$, $\chi$ (which are the coordinates we use in our Hamiltonian). The results are shown in Fig. 5; in these new coordinates no strong dependence on the initial conditions was spotted.
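As a rough cross-check of the order of magnitude of the energies quoted in the caption of Fig. 5, one may compare them with continuum estimates. This comparison is not part of the numerical analysis above; it is a sketch under the following assumptions: only the $g$ and $K$ terms of the static Hamiltonian are kept, the chain is treated as a continuum, and one of the two sum/difference coordinates is frozen at zero. A kink in the sum coordinate then has the sine-Gordon energy $16\sqrt{gK}$, while a kink in the difference coordinate has energy $4\pi\sqrt{gK}$ (both follow from the Bogomolny bound). The pairing of these two estimates with specific caption entries is our reading.

```python
import math

def continuum_energy_sum_kink(g, K):
    """Kink in the sum coordinate, difference coordinate frozen at zero.

    Energy density (g/2) psi'^2 + 4 K (1 - cos psi): sine-Gordon, E = 16 sqrt(g K).
    """
    return 16.0 * math.sqrt(g * K)

def continuum_energy_difference_kink(g, K):
    """Kink in the difference coordinate, sum coordinate frozen at zero.

    Energy density (g/2) chi'^2 + 2 K (1 - cos chi)^2, E = 4 pi sqrt(g K).
    """
    return 4.0 * math.pi * math.sqrt(g * K)

# Energies quoted in the caption of Fig. 5, in units of K, keyed by g.
quoted = {150.0: (195.3, 153.4), 23.4: (75.56, 59.34)}

relative_deviation = {}
for g, (e_sum, e_diff) in quoted.items():
    est_sum = continuum_energy_sum_kink(g, 1.0)
    est_diff = continuum_energy_difference_kink(g, 1.0)
    relative_deviation[g] = (abs(est_sum - e_sum) / e_sum,
                             abs(est_diff - e_diff) / e_diff)
    print(g, est_sum, e_sum, est_diff, e_diff)
```

The continuum values lie slightly above the discrete ones, and the gap grows as $g$ decreases, which is the expected trend since a smaller $g$ means a narrower soliton and stronger discreteness effects.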

B. Solitons in the composite model

Let us now turn to the numerical investigation of our model. We will consider the case when the intrapair distance at equilibrium is zero, i.e. we will set $a = r + d_h$ (contact approximation). Notice that we are not considering the zero-radius approximation for the bases, so that in general $r \neq d_h$. The Hamiltonian of the system can be easily derived from the Lagrangian (III.1) and is given by $H = T_B + T_b + V_t + V_s + V_p + V_h + V_w$. We use the shorthand notation

+ ? n = n , = n , n = ?+ , n = ?? ; n n ?n = n+1 ? n , Sn = n+1 + n ;

(IX.3)

and similarly for the other variables; we also write = R/r , = R/dh ; with the values given in table I, it results = 0.92, = 0.53. With these notations, we have TB = Tb = ?n 2 + ?n 2 2 2 2 2 2 2 2 n Ib [?n + ?n + ?n + ?n + 2?n ?n + 2?n ?n + (?n + ?n ) 2 2 +2(?n + ?n + ?n ?n + ?n ?n ) cos n cos n +2(2?n ?n + ?n ?n + ?n ?n ) sin n sin n ] 2Kt n [cos ?n cos ?n ? 1] 1 2 2 2 n 4[1 + ? cos(?n ) cos(?n ) ? cos(?n + ?n ) cos(?n + ?n ) 2 Ks r +2 cos((Sn ? Sn )/2) sin((?n ? ?n )/2) sin((?n ? ?n + ?n ? ?n )/2) +2 cos((Sn + Sn )/2) sin((?n + ?n )/2) sin((?n + ?n + ?n + ?n )/2)] 1 2 2 2 2 2 n 4[(1 + ) + cos (n + n ) + 2 cos n cos n cos(n + n ) + cos n 2 Kp dh ?2(1 + ) cos n cos n ? 2(1 + ) cos(n + n ) cos(n + n )] Kh n [2 ? cos(n+5 ? n ) ? cos(n+5 ? n )] Kw n [tanh(n + n ) + tanh(n ? n )]

n IB

Vt = Vs =

(IX.4)

Vp = Vh = Vw =


FIG. 6: Region of instability (black region) for the discrete solitons of topological type (0, 1) in the $(g_t, g_s)$ plane.

TABLE V: Values of the typical energies characterizing the different interactions in the Hamiltonian of Eq. (IX.4).

$E_t$    $2.6 \times 10^{2}$ kJ/mol
$E_s$    $2.9 \times 10^{3}$ kJ/mol
$E_p$    $4 \times 10^{2}$ kJ/mol
$E_h$    $10$ kJ/mol
$E_w$    $4 \times 10^{-1}$ kJ/mol

Note that we have inserted in the Hamiltonian the con?ning potential Vw in order to implement dynamically the constraint (III.2) for the non-topological angles ?(a) . Adding this term in the potential is also instrumental in stabilizing the numerical minimizations. The Hamiltonian (IX.4) reduces to that of the Yakushevich model setting = = 0 (and disregarding the helicoidal term); with this we get TB Tb Vt Vs Vp Vw = = = = = = IB n ?n 2 + ?n 2 Ib (1 + 2 ) n [?n 2 + ?n 2 ] 2Kt n [cos ?n cos ?n ? 1] 1 2 2 n 4(1 + ) [1 ? cos ?n cos ?n ] 2 Ks r 1 2 2 2 n 4(1 + ) [1 ? 2 cos n cos n + cos n ] 2 Kp dh 0;

(IX.5)

Also note that in this case $V_t$ and $V_s$ differ just by a multiplicative factor. The typical energies involved in the different interactions are given by the coefficients in Eq. (IX.4), $E_t = 2K_t$, $E_s = \frac{1}{2} K_s r^2$, $E_p = \frac{1}{2} K_p d_h^2$, $E_h = K_h$, $E_w = K_w$, which represent, respectively, the typical torsional, stacking, pairing, helicoidal and confining energies. Using the values of the physical parameters given in tables I and III, and choosing $E_w$ in order to keep the confining energy at least a full order of magnitude smaller than any other one, we get for the typical interaction energies the values given in table V. In order to work with dimensionless quantities, throughout this section we will measure energies in units of $E_p = (1/2) K_p d_h^2 = 4.0 \times 10^{2}$ kJ/mol. Using the values of the kinematical parameters given in table II and those of the dynamical parameters given in table III, the dimensionless coupling constants turn out to be

$$
g_t = E_t/E_p = 0.65 , \quad g_s = E_s/E_p = 7.2 , \quad g_p = 1 , \quad g_h = E_h/E_p = 0.026 , \quad g_w = E_w/E_p = 0.001 . \tag{IX.6}
$$

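Note that Eq. (IX.5) implies that, in the limit of vanishing non-topological angles, the Yakushevich couplings (K, g) and our coupling constants are related by

$$
K = 2 \left( 1 + R/d_h \right)^2 g_p , \qquad g = 2 g_t + 4 \left( 1 + R/r \right)^2 g_s ; \tag{IX.7}
$$

this also yields

$$
\frac{g}{K} = \frac{g_t + 2 \left( 1 + R/r \right)^2 g_s}{\left( 1 + R/d_h \right)^2 g_p} \simeq \frac{g_t + 7.4\, g_s}{2.3\, g_p} \simeq 23 . \tag{IX.8}
$$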
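The arithmetic leading from the typical energies of table V to the dimensionless couplings and to the ratio $g/K \simeq 23$ can be retraced in a few lines. The sketch below uses only numbers quoted in the text ($R/r = 0.92$, $R/d_h = 0.53$ and the energies of table V); the dictionary and function names are illustrative.

```python
# Typical energies of table V, in kJ/mol.
energies = {"t": 2.6e2, "s": 2.9e3, "p": 4.0e2, "h": 10.0, "w": 4.0e-1}

# Dimensionless couplings: every energy is measured in units of E_p.
couplings = {name: value / energies["p"] for name, value in energies.items()}

R_OVER_R = 0.92    # ratio R/r quoted in the text
R_OVER_DH = 0.53   # ratio R/d_h quoted in the text

def yakushevich_ratio(g_t, g_s, g_p, r_ratio=R_OVER_R, dh_ratio=R_OVER_DH):
    """Ratio g/K between the effective Yakushevich couplings."""
    numerator = g_t + 2.0 * (1.0 + r_ratio) ** 2 * g_s
    denominator = (1.0 + dh_ratio) ** 2 * g_p
    return numerator / denominator

ratio = yakushevich_ratio(couplings["t"], couplings["s"], couplings["p"])
print(couplings, ratio)
```

Since the input energies carry two significant figures, the last digit of the ratio depends on where the rounding is applied; this is why values between 23 and 23.4 are all compatible with the data.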


FIG. 7: Stationary solitonic solutions of our model of topological numbers (1, 0), with energy $E = 80.06\,E_p$ (thick line), compared with the solitonic solutions of the Yakushevich model of energy $E = 75.56\,E_p$ (thin line). Each of the four panels, upper left (a), upper right (b), lower left (c) and lower right (d), shows one of the four angles of the model. The thin line segments visible in (b) show the small difference between the soliton profiles of our model and of the Yakushevich model.

FIG. 8: Stationary solitonic solutions of our model with topological numbers (0, 1) for different values of the normalized couplings. The topological angle is depicted on the left whereas the non-topological one is depicted on the right. The soliton relative to the physical coupling constants, with energy $E = 189.9\,E_p$ (thick line), is shown together with those relative to the coupling constants $g_t = 0$, $g_s = 46$, with energy $E = 492.4\,E_p$ (thin dashed line), and $g_t = 345$, $g_s = 0$ (thin line, $E = 388.5\,E_p$), to show how the profile would change upon increasing the coupling constants in the two extreme cases of negligible torsional or stacking interactions.

Most of the statements made in the previous section for the Yakushevich Hamiltonian apply almost verbatim to our case. We obtain an approximate profile of a soliton, subject to the boundary conditions (VII.14), by minimizing numerically the Hamiltonian through the conjugate-gradient algorithm, in particular through its implementations in the GSL [28] and in Numerical Recipes [35]. To enforce a particular topological type (p, q) for the soliton under study we fix the topological angles at the extremes of the chain so that they vanish at one end and equal $2\pi p$ and $2\pi q$ at the other, while the non-topological angles are required to be identically zero at the extremes. As starting point for the algorithm (see above) we use the natural choice [57] n = p(1 + tanh((2n ? N ))) , n = q(1 + tanh((2n ? N ))) , n = n = 0 , (IX.9)

where the remaining quantity in the argument of the hyperbolic tangent is a parameter used to adjust the profile of the initial configuration (the starting point) and N is the number of sites in the chain. The number N is of course much smaller than in real DNA (usually N = 4000 in our simulations) but big enough to ensure that the topological angles are constant at the beginning and the end of the chain within the numerical precision of our computations. As in the previous case, there is a threshold for the coupling constants that must be surpassed for the solitons to be

stable; in Fig. 6 we show the region of instability for solitons in the $(g_t, g_s)$ plane when we fix the values of the other coupling constants to be $g_h = 0.026$, $g_p = 1$, $g_w = 0.001$. As above, in the case of solitons of topological type (1, 0) and (0, 1) we always reach the same minimum, within $10^{-5}$ in the energy and $10^{-2}$ in the angle, while the profile parameter of the starting point varies across almost two orders of magnitude, provided that it is no smaller than 4 to avoid falling on the step solution. The (1, 1) soliton also shows the very same behavior as for the Yakushevich Hamiltonian, namely a different local extremum for every value of the parameter, over a wider range of energies (within 2%) around 160 K, except for a short interval between 4 and 5.5; in the latter we get the same stable behavior observed for the (1, 0) and (0, 1) solitons. In this case again, as in the Yakushevich case, the sensitivity of the numerical solitonic solutions to the values of the parameter could be an indication of the existence of many, almost degenerate, solitonic solutions for topological numbers different from (1,0) or (0,1). We would like to stress that the existence of different, almost degenerate, local minima is a typical feature of most bio-physical systems. However, our results concerning this point have to be regarded just as an indication. Further investigations, in particular at the analytical level, are needed in order to draw a definite conclusion.

In Fig. 7 we plot the (1, 0) soliton of our model with the physical values of the normalized coupling constants given by Eq. (IX.6) and compare it with the one obtained for the Yakushevich model. The profiles of the topological angles change very little from the corresponding profiles of the Yakushevich solitons. We have also investigated the deformation of soliton profiles when adjusting selected parameters of our Hamiltonian. First, in order to see how the shape of the soliton changes upon increasing the strength of the torsional/stacking interactions, in Fig. 8 we compare the profiles of the (0, 1) solitons with profiles corresponding to $g/K \simeq 150$ (see Eq. (IX.8)), i.e. the coupling constant used in [57]. As it is not completely clear how to separate the interaction strength between torsional and stacking interactions, for our choice of physical constants in Sect. V we have used about the smallest reasonable value for $K_t$. We present the profiles corresponding to the two extreme possibilities: the one in which we put all the strength in the backbone torsional interaction ($g_t = 345$, $g_s = 0$), and the one in which we put all of it in the base stacking ($g_t = 0$, $g_s = 46$). The effect is the widening of both the soliton and the non-topological profiles by roughly a factor 4 in the first case and a factor 2 in the second case.

In Fig. 9 we compare the profiles of the soliton (1, 1) with those obtained by using the correct distance function for $V_p$, namely by replacing the square of the (rescaled) intrapair distance multiplying $g_p$ with the square of its difference from $d_0/d_h$ (where $d_0 \simeq 2$ Å is the equilibrium distance between two bases in a pair [72]), and by varying the helicoidal interaction term. No relevant changes are detected in the first case: relative differences are of the order of $10^{-2}$ in the energy and $10^{-1}$ in the angles; even increasing the base-pair distance by two orders of magnitude does not modify the situation. As for the helicoidal term, we get variations of the same order of magnitude as above if we simply turn it off. If we instead increase the coupling constant by one order of magnitude ($g_h = 0.26$), then we get energy and angle changes of the order of $10^{-1}$, and by increasing it to $g_h = 1$ we arrive at changes of the order of $10^{0}$ in both the angles and the energy. Raising $g_h$ up to $g_h = 2$ leads to the disappearance of the soliton; it seems reasonable to argue that this is due to such an interaction favoring a sharper transition between limit behaviors, so that the discreteness effect discussed in the previous subsection arises.
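The two extreme coupling choices used for Fig. 8 can be recovered by inverting Eq. (IX.8) for a target ratio $g/K = 150$, putting the whole strength either in $g_t$ or in $g_s$. The sketch below assumes the ratios $R/r = 0.92$ and $R/d_h = 0.53$ quoted earlier; the function name is illustrative.

```python
def extreme_couplings(target_ratio, r_ratio=0.92, dh_ratio=0.53, g_p=1.0):
    """Couplings giving g/K = target_ratio with only one of g_t, g_s nonzero.

    Inverts g/K = (g_t + 2 (1 + R/r)^2 g_s) / ((1 + R/d_h)^2 g_p).
    """
    rhs = target_ratio * (1.0 + dh_ratio) ** 2 * g_p
    g_t_only = rhs                                    # case g_s = 0
    g_s_only = rhs / (2.0 * (1.0 + r_ratio) ** 2)     # case g_t = 0
    return g_t_only, g_s_only

g_t_only, g_s_only = extreme_couplings(150.0)
print(g_t_only, g_s_only)
```

The results agree with the quoted pair ($g_t = 345$, $g_s = 46$) to within a few percent; the residual difference comes from using the unrounded ratios here instead of the rounded factors 7.4 and 2.3.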

C. Discussion

The numerical analysis we have performed shows the existence of solitonic solutions of our composite DNA model. The profiles of the topological solitons of our model (in particular, the part relating to the topological degree of freedom) are both qualitatively and quantitatively very similar to those of the Y model. This means that the most relevant (for DNA transcription) and characterizing feature of the nonlinear DNA dynamics present in the Y model is preserved when considering geometrically more complex, and hence more realistic, DNA models. Moreover, the topological soliton profiles of our model seem to change very little when either the physical parameters change in a reasonable range or the form of the potential modelling the pairing interaction is modified to a more realistic form. In particular, the form of the topological solitons is very little sensitive to the interchange of torsional and stacking coupling constants. This feature adds further reasons why the Y model, although based on a strong simplification of the DNA geometry, works quite well in describing solitonic excitations. The Y model, indeed, does not distinguish between torsional and stacking interactions; but, as we have shown, this distinction is not relevant, at least as long as one is only interested in the existence and form of the soliton solutions. The compositeness of our model becomes relevant, and rather crucial, when it comes on the one hand to allowing the existence of solitons together with a physically realistic choice of the physical parameters characterizing the DNA, and on the other hand to having predictions that also fit experimental observations for quantities related to small amplitude dynamics, such as transverse


FIG. 9: Comparison of stationary solitonic solutions of our model with those obtained using a modified pairing potential $V_p$. Each of the four panels, upper left (a), upper right (b), lower left (c) and lower right (d), shows one of the four angles of the model. The thick line represents a soliton with E = 80.06 and topological numbers (1, 1). The thin dashed line gives the profile for the same soliton with energy $E = 74.41\,E_p$ and with the correct pairing potential, obtained by replacing the intrapair distance in $V_p$ with its difference from $d_0/d_h$. The latter solitonic solution has been derived taking $d_0 = 3.2\,d_h$, namely an order of magnitude bigger than its physical value, to enhance the profile differences (an almost identical profile is obtained if we suppress the helicoidal term from the Hamiltonian). The thin continuous line ($E = 102.9\,E_p$) is the profile we get by increasing the helicoidal term to $g_h = 1$.

phonon speed. In other words, the somewhat more detailed description of DNA dynamics provided by our model allows it to be effective, with the same parameters, across regimes, and to provide meaningful quantities in both the linear and the fully nonlinear regime. The solitonic solutions of the composite model also share two other features with those of the Y model, namely the presence of a numerical instability and the existence of quasi-degenerate solutions for solitonic configurations with higher topological numbers. We expect that the model considered here is the simplest DNA model describing rotational degrees of freedom which, with physically realistic values of the coupling constants and other parameters, allows for the existence of topological solitons and at the same time is also compatible with observed values of bound energies and phonon speeds in DNA.

X. SUMMARY AND CONCLUSIONS

Let us, in the end, summarize our discussion and the results of our work, and state the conclusions which can be drawn from it.

A. Summary and results

Following the work by Englander et al. [17], different authors have considered simple models of the DNA double chain, focusing on rotational degrees of freedom, able to support dynamical and topological solitons [56], supposedly related to the transcription bubbles present in real DNA and playing a key role in the transcription process. These models usually consider a single (rotational) degree of freedom per nucleotide [56], although models with one rotational and one radial degree of freedom per nucleotide have also been considered [3, 4, 11, 31, 53] (as an extension of purely radial models [39, 41], considered in the study of DNA denaturation). A simple model which has been studied in depth is the so-called Y model [52]. This supports topological solitons (of sine-Gordon type) and provides correct orders of magnitude for several physically relevant quantities [26]; on the other hand, the soliton speed remains

essentially a free parameter [22, 23], and the speed of transverse phonons can be made to have a physical value only by assigning unphysical values to the coupling constants of the model [57].

Here we have considered an extension of the Y model, with two degrees of freedom, both rotational, per nucleotide; one of these is associated to rotations of the nucleoside (unit of the backbone) around the phosphodiester chain and is topological, i.e. can go round the $S^1$ circle, while the other is associated to rotations of the attached nitrogen base around the C1 atom in the sugar ring and, due to steric hindrances, is non-topological, i.e. rotations are limited to a relatively small range around the equilibrium position. We denoted this as a composite Y model. Several parameters appear in the model; some of these are related to the geometry and the kinematics of the DNA molecule, while others are coupling constants entering in the potential used to model intramolecular interactions. We have assigned values to the first kind of parameters following from available direct experimental observations, and for the second kind of parameters we used experimental data on the ionization energies of the concerned couplings and the form of the potentials appearing in the model. That is, these parameters were not chosen by fitting dynamical predictions of the model; see sect. V for details.

We have first considered small amplitude dynamics (sect. VI); this yields the dispersion relations and produces some predictions on the phonon speed and the optical frequency for the different branches. These predictions are a first success of the present model, in that it was shown in sect. VI that, taking the physical order of magnitude for the pairing, stacking, torsional and helicoidal interactions, one can obtain the order of magnitude of the experimentally observed speed for transverse phonon excitations and the frequency threshold for the optical branch.
In particular, using a physical value for the stacking and torsional interaction energy we get a value for the transverse phonon speed which is about three times the correct one. It should be considered, in looking at this value, that we modelled the intrapair interaction by a very simple and unrealistic potential (with the aim of both keeping computations simple and allowing direct comparison with the standard Y model by making the same simplifying assumptions as there). As for the comparison with the standard Y model, hence for an evaluation of the advantages brought by considering a more articulated geometry of the nucleotide, it should be recalled that the numerical computations of Yakushevich, Savin and Manevitch [57] (which we repeated, and fully confirmed) show that, in order to obtain the experimentally observed speed for transverse phonon excitations in the framework of the standard Y model, one should take a coupling constant for the transverse intrapair interaction which is about 6000 times the correct one.

We then passed to consider the fully nonlinear regime, and in particular to look for soliton-like travelling excitations. These should have smooth variations on the space scale of nucleotides, hence we passed to a continuum description and field equations; by using the chain exchange symmetry, we considered fully symmetric and antisymmetric reductions, see (VII.5) and (VII.7). By a travelling wave ansatz we reduced these to a system of two coupled second order ODEs, see (VII.10) and (VII.11), for the topological angle, i.e. the variable associated to the topological field, and the non-topological angle, i.e. the variable associated to the non-topological field, both as functions of z. The finite energy condition (VII.12), (VII.13) requires that the solutions to this system of ODEs satisfy certain limit conditions (see (VII.14)).
These in turn imply that solutions satisfying them can be classified according to two topological indices (winding numbers for the topological fields; in the symmetric or antisymmetric case, one index is enough to determine the other as well).

We have also shown that the standard Y model can be obtained from the composite Y model by a limiting procedure (sect. VII); this also reduces the solitons of the composite model to solitons of the Y model. However, the limiting procedure requires that a certain condition is satisfied, see eq. (VIII.5), and this in turn constrains the speed of solitonic excitations; see (VIII.6). Thus, the requirement to obtain the standard Y solitons in a certain limit fixes the speed of solitons; the resulting speed is just the speed of long waves as determined by the dispersion relations.

Finally, in sect. IX we conducted a careful numerical investigation of the simpler soliton solutions for the composite Y model. We preliminarily checked our numerical routines on the standard Y model and fully confirmed the results of Yakushevich, Savin and Manevitch [57], also confirming certain instability phenomena they reported. We then considered the solitons for the composite Y model with the values of the parameters descending from their physical meaning (i.e. with no parameter fitting), confirming their existence, properties and stability. We also showed how the profile of the soliton component corresponding to the topological degree of freedom is extremely similar to the standard Y soliton with the same topological numbers. We considered next the stability of these soliton solutions upon varying the parameters of the model, and observed that, as in the standard Y case, there is a stability threshold. Thus, the existence and stability of soliton solutions for physical values of the parameters is a nontrivial prediction.

B. Discussion and conclusions

The composite Y model considered here retains all the favorable features of the standard Y model. At the same time, its more articulated geometry allows it, with physical values of the coupling constants

and other parameters entering in the model, to reproduce relevant values of physical quantities related to the linear regime (such as the speed of transverse phonons, which was a critical test for the standard Y model) and to support stable soliton solutions. Further, and differently from the standard Y model, it provides a precise prediction for the soliton speed; this is quite reasonable physically, as it corresponds to the speed of long waves as obtained from the dispersion relations for the model. Thus, our model passed some (in our opinion, significant) quantitative tests and provides precise predictions.

It should also be stressed that we used, both to simplify the mathematics and to have a direct comparison with the standard Y model, a very simple form for the intrapair coupling potential (and at some stage also resorted to the contact approximation to get simpler formulas, again as in the standard Y model treatment). It is quite conceivable that adopting a more realistic potential will provide better estimates of relevant physical quantities, in particular for quantities related to the linear regime. However, experience recently gained with the standard Y model [24, 25] suggests that the predictions related to the fully nonlinear regime are rather insensitive to the detailed form of the potential and to whether or not the contact approximation is adopted; we are thus rather confident that future work with more realistic potentials will confirm the results obtained in the simple setting considered here.

Finally, we would like to remark on a very relevant feature of our model. All the DNA models amenable to analytic treatment look at homogeneous DNA, although the genetic information lies precisely in the non homogeneous part of the DNA (i.e. the base sequence; bases have rather different physical and geometrical characteristics). Our discussion was no exception, and we considered identical bases with average geometrical and physical characteristics.
However, of the two degrees of freedom we considered for each nucleotide, one is concerned with the uniform part of the DNA molecule (the nucleosides), the other with the non homogeneous part (the base sequence). Moreover, it turned out that, as far as soliton excitations are concerned, the most relevant role is played by the (topological) variables associated to the uniform part, which are directly at play in the topological solitons, while the (non topological) variables associated to the non uniform part are in a way just accompanying the soliton. This suggests that, within the framework of composite models, the non homogeneous case can be studied as a (nonsingular) perturbation of the homogeneous case; needless to say, by this we mean an analytical, albeit approximate, study, and not just a numerical one. This represents a significant advance with respect to what is possible with the simple models considered so far.

Acknowledgements

This work received support from the Italian MIUR (Ministero dell'Istruzione, Università e Ricerca) under the program COFIN2004, as part of the PRIN project "Mathematical Models for DNA Dynamics" (M²D²).

[1] G. Altan-Bonnet, A. Libchaber and O. Krichevsky, Bubble dynamics in double-stranded DNA, Phys. Rev. Lett. 90 (2003), 138101
[2] N.K. Banavali and A.D. MacKerell Jr., Free energy and structural pathways of base flipping in a DNA GCGC containing sequence, J. Mol. Biol. 319 (2002), 141-160
[3] M. Barbi, S. Cocco and M. Peyrard, Helicoidal model for DNA opening, Phys. Lett. A 253 (1999), 358-369; Vector nonlinear Klein-Gordon lattices: general derivation of small amplitude envelope soliton solution, Phys. Lett. A 253 (1999), 161-167
[4] M. Barbi, S. Cocco, M. Peyrard and S. Ruffo, A twist-opening model of DNA, J. Biol. Phys. 24 (1999), 97-114
[5] M. Barbi, S. Lepri, M. Peyrard and N. Theodorakopoulos, Thermal denaturation of a helicoidal DNA model, Phys. Rev. E 68 (2003), 061909
[6] M.D. Barkley and B.H. Zimm, Theory of twisting and bending of chain macromolecules; analysis of the fluorescence depolarization of DNA, J. Chem. Phys. 70 (1979), 2991-3007
[7] N. Bruant, D. Flatters, R. Lavery and D. Genest, From atomic to mesoscopic descriptions of the internal dynamics of DNA, Biophysical Journal 77 (1999), 2366-2376
[8] A. Campa, Bubble propagation in a helicoidal molecular chain, Phys. Rev. E 63 (2000), 021901
[9] C. Calladine and H. Drew, Understanding DNA, Academic Press (London) 1992; C. Calladine, H. Drew, B. Luisi and A. Travers, Understanding DNA (3rd edition), Academic Press (London) 2004
[10] Y.Z. Chen and E.W. Prohofsky, Theory of pressure-dependent melting of the DNA double helix: role of straining hydrogen bonds, Phys. Rev. E 47 (1992)
[11] S. Cocco and R. Monasson, Statistical mechanics of torque induced denaturation of DNA, Phys. Rev. Lett. 83 (1999), 5178-5181
[12] S. Cuenda and A. Sanchez, On the discrete Peyrard-Bishop model of DNA: stationary solutions and stability, preprint q-bio.OT/0511036 (2006), to appear in Chaos
[13] Th. Dauxois, Dynamics of breather modes in a nonlinear helicoidal model of DNA, Phys. Lett. A 159 (1991), 390-395
[14] A.S. Davydov, Solitons in Molecular Systems, Kluwer (Dordrecht) 1981
[15] J. De Luca, E.D. Filho, A. Ponno and J.P. Ruggiero, Energy localization in the Peyrard-Bishop DNA model, Phys. Rev. E 70 (2004), 026213
[16] H.R. Drew, R.M. Wing, T. Takano, C. Broka, S. Tanaka, K. Itakura and R.E. Dickerson, Structure of a B-DNA dodecamer: conformation and dynamics, Proc. Natl. Acad. Sci. USA 78 (1981), 2179-2183
[17] S.W. Englander, N.R. Kallenbach, A.J. Heeger, J.A. Krumhansl and A. Litwin, Nature of the open state in long polynucleotide double helices: possibility of soliton excitations, Proc. Natl. Acad. Sci. USA 77 (1980), 7222-7226
[18] V.K. Fedyanin, I. Gochev and V. Lisy, Nonlinear dynamics of bases in continual model of DNA helices, Stud. Biophys. 116 (1984), 59-64; V.K. Fedyanin and V. Lisy, Soliton conformational excitations in DNA, Stud. Biophys. 116 (1984), 65-71
[19] M.D. Frank-Kamenetskii, Biophysics of the DNA molecule, Phys. Rep. 288 (1997), 13-60
[20] G. Gaeta, On a model of DNA torsion dynamics, Phys. Lett. A 143 (1990), 227-232
[21] G. Gaeta, Solitons in planar and helicoidal Yakushevich model of DNA dynamics, Phys. Lett. A 168 (1992), 383-389
[22] G. Gaeta, A realistic version of the Y model for DNA dynamics; and selection of soliton speed, Phys. Lett. A 190 (1994), 301-308
[23] G. Gaeta, Results and limitations of the soliton theory of DNA, J. Biol. Phys. 24 (1999), 81-96
[24] G. Gaeta, Solitons in the Yakushevich model of DNA beyond the contact approximation, preprint q-bio/0604003 (2006)
[25] G. Gaeta, Solitons in Yakushevich-like models of DNA dynamics with improved intrapair potential, preprint q-bio/0604004 (2006)
[26] G. Gaeta, C. Reiss, M. Peyrard and Th. Dauxois, Simple models of non-linear DNA dynamics, Rivista del Nuovo Cimento 17 (1994) n.4, 1-48
[27] http://chemistry.gsu.edu/glactone/
[28] http://www.gnu.org/software/gsl/manual/gsl-ref_35.html#SEC474
[29] J.A. Gonzalez and M. Martin-Landrove, Solitons in a nonlinear DNA model, Phys. Lett. A 191 (1994), 409-415
[30] C.A. Hunter and J.K.M. Sanders, The Nature of π-π Interactions, J. Am. Chem. Soc. 112 (1990), 5525-5534; R. Khairoutdinov, http://www.uaf.edu/chem/467Sp05/lecture4.pdf
[31] M. Joyeux and S. Buyukdagli, Dynamical model based on finite stacking enthalpies for homogeneous and inhomogeneous DNA thermal denaturation, Phys. Rev. E 72 (2005), 051902; S. Buyukdagli, M. Sanrey and M. Joyeux, Towards more realistic dynamical models for DNA secondary structure, Chem. Phys. Lett. 419 (2006), 434-438
[32] N. Komarova and A. Soffer, Nonlinear waves in double stranded DNA, Bull. Math. Biol. 67 (2005), 701-718
[33] R. Lavery, A. Lebrun, J.F. Allemand, D. Bensimon and V. Croquette, Structure and mechanics of single biomolecules: experiments and simulation, J. Phys.: Condens. Matter 14 (2002), R383-R414
[34] V. Muto, J. Holding, P.L. Christiansen and A.C. Scott, Solitons in DNA, J. Biomol. Str. Dyn. 5 (1988), 873-894; V. Muto, P.S. Lomdahl and P.L. Christiansen, Two-dimensional discrete model for DNA dynamics: longitudinal wave propagation and denaturation, Phys. Rev. A 42 (1990), 7452-7458
[35] Numerical Recipes, see http://www.library.cornell.edu/nr/bookfpdf/f10-6.pdf
[36] PDB repository, http://www.rcsb.org/pdb/
[37] PDB files at http://chemistry.gsu.edu/glactone/PDB/pdb.html
[38] M. Peyrard (editor), Nonlinear excitations in biomolecules (Proceedings of a workshop held in Les Houches, 1994), Springer (Berlin) and Les Editions de Physique (Paris) 1995
[39] M. Peyrard, Nonlinear dynamics and statistical physics of DNA, Nonlinearity 17 (2004), R1-R40
[40] M. Peyrard (editor), Nonlinear phenomena in biology (Proceedings of the Pushchino conference, June 23-27, 1998), published as issues 2-4 of J. Biol. Phys. 24 (1999)
[41] M. Peyrard and A.R. Bishop, Statistical mechanics of a nonlinear model for DNA denaturation, Phys. Rev. Lett. 62 (1989), 2755-2758
[42] M. Peyrard, A.R. Bishop and Th. Dauxois, Dynamics and thermodynamics of a nonlinear model for DNA denaturation, Phys. Rev. E 47 (1992)
[43] J.W. Powell, G.S. Edwards, L. Genzel, F. Kremer, A. Wittlin, W. Kubasek and W. Peticolas, Investigation of far-infrared vibrational modes in polynucleotides, Phys. Rev. A 35 (1987), 3929-3939
[44] E.W. Prohofsky, Solitons hiding in DNA and their possible significance in DNA transcription, Phys. Rev. A 38 (1988), 1538-1541
[45] G. Saccomandi and I. Sgura, The relevance of nonlinear stacking interactions in simple models of double-stranded DNA, preprint 2006, to appear in J. Royal Soc. Interface
[46] W. Saenger, Principles of nucleic acid structure, Springer (Berlin) 1984
[47] M. Salerno, Discrete model for DNA-promoter dynamics, Phys. Rev. A 44 (1991), 5292-5297
[48] S.B. Smith, L. Finzi and C. Bustamante, Direct mechanical measurements of the elasticity of single DNA molecules by using magnetic beads, Science 258 (1992), 1122-1126
[49] T.R. Strick, M.N. Dessinges, G. Charvin, N.H. Dekker, J.F. Allemand, D. Bensimon and V. Croquette, Stretching of macromolecules and proteins, Rep. Prog. Phys. 66 (2003), 1-45
[50] S. Takeno and S. Homma, Topological solitons and modulated structure of bases in DNA double helices, Progr. Theor. Phys. 70 (1983), 308-311; S. Homma and S. Takeno, A coupled base-rotator model for structure and dynamics of DNA, Progr. Theor. Phys. 72 (1984), 679-693
[51] N. Theodorakopoulos, M. Peyrard and R.S. MacKay, Nonlinear structures and thermodynamic instabilities in a one-dimensional lattice system, Phys. Rev. Lett. 93 (2004), 258101
[52] L.V. Yakushevich, Nonlinear DNA dynamics: a new model, Phys. Lett. A 136 (1989), 413-417
[53] L.V. Yakushevich, Investigation of a system of nonlinear equations simulating DNA torsional dynamics, Stud. Biophys. 140 (1991), 163-170
[54] L.V. Yakushevich, Nonlinear DNA dynamics: hierarchy of the models, Physica D 79 (1994), 77-86
[55] L.V. Yakushevich, DNA as a nonlinear dynamical system, Macromol. Symp. 160 (2000), 61-68
[56] L.V. Yakushevich, Nonlinear Physics of DNA, Wiley (Chichester) 1998; second edition 2004
[57] L.V. Yakushevich, A.V. Savin and L.I. Manevitch, Nonlinear dynamics of topological solitons in DNA, Phys. Rev. E 66 (2002), 016614
[58] S. Yomosa, Soliton excitations in deoxyribonucleic acid (DNA) double helices, Phys. Rev. A 27 (1983), 2120-2125; Solitary excitations in deoxyribonucleic acid (DNA) double helices, Phys. Rev. A 30 (1984), 474-480
[59] L.L. van Zandt, DNA soliton realistic parameters, Phys. Rev. A 40 (1989), 6134-6137
[60] G. Weber, Sharp DNA denaturation due to solvent interaction, Europhys. Lett. 73 (2006), 806-811
[61] Ch.T. Zhang, Soliton excitations in deoxyribonucleic acid (DNA) double helices, Phys. Rev. A 35 (1987), 886-891; Harmonic and subharmonic resonances of microwave absorption in DNA, Phys. Rev. A 40 (1989), 40-45
[62] F. Zhang and M.A. Collins, Model simulations of DNA dynamics, Phys. Rev. E 52 (1995), 4217-4224
[63] It should be stressed that when we speak of mechanical models of DNA we exclude consideration of the all-important interactions between DNA and its environment.
The latter includes at least the fluid in which DNA is immersed, and interaction with this leads to energy exchanges; one should thus include in the equations describing DNA dynamics both dissipation terms and random terms due to interaction with molecules in the fluid. We will work here at a purely mechanical level, i.e. we do not consider these effects at present. Moreover, it should be mentioned that even neglecting dissipative and Brownian motion effects, one could account for interaction with the solvent by including effective terms in the intrapair potential $V_p$ (see below), as done e.g. in [62]; it has recently been shown that in the context of the Peyrard-Bishop model this leads to a sharpening of certain transitions [60].
[64] It is appropriate, in this context, to mention earlier models proposed by Fedyanin, Gochev and Lisy [18], Muto et al. [34], Prohofsky [44], Takeno and Homma [50], van Zandt [59], Yomosa [58], and Zhang [61].
[65] It is usual, for ease of language, to refer to models in which helicoidal interactions are taken into account as "helicoidal", and to models in which they are overlooked as "planar". Needless to say, the geometry of the model is the same in both cases.
[66] This assumption (see also section 9) is common to all the mathematical, as opposed to physico-chemical, models of DNA, and as already remarked is necessary in order to perform an analytical study of the model. We refer e.g. to [39, 56] for a discussion of this point. The study of real sequences, i.e. with different characteristics for different bases, is possible numerically; see e.g. [47, 62].
[67] A more realistic choice could have important consequences for the qualitative, and not just quantitative, behavior of nonlinear excitations. This point is discussed in [45].
[68] The stacking interactions are of essentially electrostatic nature; thus it is reasonable in this context to see the bases as dipoles. If we have two identical dipoles made of charges a distance $d$ apart, their separation vector being along the $z$ direction, and force them to move in parallel planes orthogonal to the $z$ axis and a distance $L$ apart, then denoting by $\rho$ the distance of their projections in the $(x, y)$ plane we have $V(\rho) = \frac{1}{2}\left[(d+L)^{-3} - 2L^{-3} + (d-L)^{-3}\right]\rho^2 - \frac{3}{8}\left[(d+L)^{-5} - 2L^{-5} + (d-L)^{-5}\right]\rho^4 + O(\rho^6)$.
[69] The interaction more properly depends on the degree of superposition of the projections of the bases onto the plane orthogonal to the double helix axis (and moreover depends on the details of the charge distribution on each base), and in particular quickly goes to zero once the bases assume different positions. Moreover, once the bases extrude from the double helix there are ionic interactions between the bases and the solvent which should be taken into account [62]. In this note, however, we will just consider harmonic stacking.
[70] In physical terms, this is not obvious by itself, as the filaments could have to wind around the double helix if the two connected nucleosides are twisted by $2\pi$ with respect to each other; however, when this happens the filaments are actually broken and then built again thanks to quantum fluctuations.
[71] There are indeed qualitative arguments suggesting a degeneration of the minimizing configuration whenever the topological numbers are not (1,0) or (0,1); we will not discuss these arguments nor the matter here.
[72] In order to avoid any confusion, we stress that here the distance between bases refers e.g. to the N-H distance in a N-H-O hydrogen bond; the total length of the bond is about 3 Å, and often one refers to this as the interbase distance. Here instead we consider the H atom (which lies at about 1 Å from the nearer atom in the H bond) as part of one of the bases, and hence consider its distance from the other atom as the interbase distance.
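The small-distance expansion quoted in note [68] can be checked numerically. The following is a minimal sketch, not part of the original analysis, under several assumptions: unit charges and a unit Coulomb prefactor; the constant term V(0) subtracted; and the combination (d - L) read as the absolute axial separation |d - L|, so d must differ from L. The function names `exact_energy` and `series_energy`, the variable name `rho` for the in-plane distance, and the numerical values of d and L are all hypothetical choices made for illustration.

```python
import math


def exact_energy(rho, d, L):
    """Coulomb energy of the two dipoles, minus its value at rho = 0.

    Assumptions (not from the paper): unit charges, unit Coulomb prefactor.
    Like charges are separated axially by L, unlike ones by L + d and |L - d|.
    """
    def inv_dist(axial, r):
        # inverse distance between two charges with axial offset `axial`
        # and in-plane offset `r`
        return 1.0 / math.hypot(axial, r)

    def total(r):
        return (2.0 * inv_dist(L, r)
                - inv_dist(L + d, r)
                - inv_dist(abs(L - d), r))

    return total(rho) - total(0.0)


def series_energy(rho, d, L):
    """Quadratic plus quartic terms of the expansion in note [68]."""
    a, b, c = d + L, L, abs(d - L)
    c2 = 0.5 * (a**-3 - 2.0 * b**-3 + c**-3)
    c4 = -0.375 * (a**-5 - 2.0 * b**-5 + c**-5)
    return c2 * rho**2 + c4 * rho**4


if __name__ == "__main__":
    d, L = 1.0, 3.0  # illustrative values in arbitrary units
    for rho in (0.05, 0.1, 0.2):
        err = exact_energy(rho, d, L) - series_energy(rho, d, L)
        print(f"rho={rho:5.2f}  exact-series={err: .3e}")
```

If the truncated series is correct, the difference between the exact energy and the series should shrink roughly as the sixth power of the in-plane distance, so doubling that distance should multiply the residual by a factor close to 64.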