SQL Server Linux Container running as StatefulSet shows status as Recovery Pending
Is this a request for help?:
YES
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT
Orchestrator and version (e.g. Kubernetes, DC/OS, Swarm)
Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.0", GitCommit:"d3ada0119e776222f11ec7945e6d860061339aad", GitTreeState:"clean", BuildDate:"2017-06-29T23:15:59Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"windows/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.6", GitCommit:"7fa1c1756d8bc963f1a389f4a6937dc71f08ada2", GitTreeState:"clean", BuildDate:"2017-06-16T18:21:54Z", GoVersion:"go1.7.6", Compiler:"gc", Platform:"linux/amd64"}
What happened: I created a custom SQL Server Linux container by following process as described below.
- Create a standard SQL Sever Container by running
docker run -d -p 1433:1433 -e "SA_PASSWORD=<password>" -e "ACCEPT_EULA=Y" microsoft/mssql-server-linux - Connect to this container on 127.0.0.1, 1433 from SQL Server Management Studio and create a custom database by running following script
GO
IF(db_id(N'snoopyshoppingcart') IS NULL)
BEGIN
CREATE DATABASE snoopyshoppingcart
ON
(
NAME = ssc_dat,
FILENAME = N'/ssc/ssc.mdf'
)
LOG ON
(
NAME = ssc_log,
FILENAME = N'/ssc/ssc.ldf'
)
END;
GO
USE snoopyshoppingcart
GO
IF NOT EXISTS (SELECT * FROM sysobjects WHERE NAME='shopping' AND XTYPE='U')
BEGIN
CREATE TABLE shopping
(
AddedOn datetime,
ConnectionID nvarchar(100),
IP nvarchar(20),
CartItem nvarchar(100)
)
END
Note that database files are created on path /ssc. This is a deviation from standard SQL Server files path /var/opt/mssql/data. I want to use /ssc - custom path to use as mountpoint and mount it on Azure Disk when creating a statefulset for this container.
3. Run SHUTDOWN WITH NOWAIT.
4. Stop SQL Server Container.
5. Commit changes by running docker commit 8f maksh/snoopyshoppingcartdb and creating a new image.
6. Push this custom image by running docker push maksh/snoopyshoppingcartdb
When I create a container from this custom image and run locally (127.0.0.1, 1433), I can see that my custom database is available.
However, when I run this container as a statefulset on Kubernetes in Azure. Database shows status as Recovery Pending. My k8s manifests below.
Secret (For sa password):
apiVersion: v1
kind: Secret
metadata:
name: sqlsecret
type: Opaque
data:
sapassword: UGFzc3dvcmQxMjM0
Storage Class:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
name: azurestorageclass
provisioner: kubernetes.io/azure-disk
parameters:
skuName: Standard_LRS
location: southeastasia
storageAccount: <my-storage-account-in-same-rg-as-k8s>
Service:
apiVersion: v1
kind: Service
metadata:
name: sqlservice
labels:
app: sqlservice
spec:
type: LoadBalancer
ports:
- port: 1433
targetPort: 1433
selector:
app: sqlinux
StatefulSet:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
name: sqlserverstatefulset
spec:
serviceName: "sqlservice"
replicas: 1
template:
metadata:
labels:
app: sqlinux
spec:
terminationGracePeriodSeconds: 10
containers:
- name: sqlinux
image: maksh/snoopyshoppingcartdb
env:
- name: SA_PASSWORD
valueFrom:
secretKeyRef:
name: sqlsecret
key: sapassword
- name: ACCEPT_EULA
value: "Y"
ports:
- containerPort: 1433
volumeMounts:
- name: sql-persistent-storage
mountPath: "/ssc"
volumeClaimTemplates:
- metadata:
name: sql-persistent-storage
annotations:
volume.beta.kubernetes.io/storage-class: "azurestorageclass"
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 16Gi
Pod logs following message: This message appears only on Kubernetes. Local container doesn't log this message.
2017-08-14 07:20:48.76 spid24s Starting up database 'snoopyshoppingcart'.
2017-08-14 07:20:48.77 spid24s Error: 17204, Severity: 16, State: 1.
2017-08-14 07:20:48.77 spid24s FCB::Open failed: Could not open file /ssc/ssc.mdf for file number 1. OS error: 2(The system cannot find the file specified.).
2017-08-14 07:20:48.78 spid9s The resource database build version is 14.00.800. This is an informational message only. No user action is required.
2017-08-14 07:20:48.79 spid24s Error: 5120, Severity: 16, State: 101.
2017-08-14 07:20:48.79 spid24s Unable to open the physical file "/ssc/ssc.mdf". Operating system error 2: "2(The system cannot find the file specified.)".
2017-08-14 07:20:48.81 spid24s Error: 17207, Severity: 16, State: 1.
2017-08-14 07:20:48.81 spid24s FileMgr::StartLogFiles: Operating system error 2(The system cannot find the file specified.) occurred while creating or opening file '/ssc/ssc.ldf'. Diagnose and correct the operating system error, and retry the operation.
2017-08-14 07:20:48.81 spid9s Starting up database 'model'.
2017-08-14 07:20:48.83 spid24s File activation failure. The physical file name "/ssc/ssc.ldf" may be incorrect.
When I connect with SQL Server using external IP of service, I see my custom database is shown with status Recovery Pending
What you expected to happen: I expect custom SQL Server database should be mounted on Azure Disk and in operational state.
How to reproduce it (as minimally and precisely as possible): Follow steps as mentioned above.
Anything else we need to know:
A VHD disk gets created in storage account. This matches persistentvolumeclaim .
This is probably the NFS/remote storage problem that is present in RC1 and RC2 images. To confirm can you please pull/run the image with the :ctp2-1 tag? The bug was not present in that release. This issue is fixed in development now and will be part of the next release after RC2.
Created new image from microsoft/mssql-server-linux:ctp2-1 base image. Same results :-( Database still shows "Recovery Pending" status.
Also tried with following another approach.
Dockerfile:
FROM microsoft/mssql-server-linux:ctp2-1
# Create app directory
RUN mkdir -p /ssc
WORKDIR /ssc
# Bundle app source
COPY . /ssc
# Grant permissions for the import-data script to be executable
RUN chmod +x /ssc/setup.sh
CMD /bin/bash /ssc/entrypoint.sh
Setup.sh:
#wait for the SQL Server to come up
sleep 90s
#run the setup script to create the DB and the schema in the DB
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P <my-password> -d master -i setup.sql
Setup.sql:
GO
IF(db_id(N'snoopyshoppingcart') IS NULL)
BEGIN
CREATE DATABASE snoopyshoppingcart
ON
(
NAME = ssc_dat,
FILENAME = N'/ssc/ssc.mdf'
)
LOG ON
(
NAME = ssc_log,
FILENAME = N'/ssc/ssc.ldf'
)
END;
GO
USE snoopyshoppingcart
GO
IF NOT EXISTS (SELECT * FROM sysobjects WHERE NAME='shopping' AND XTYPE='U')
BEGIN
CREATE TABLE shopping
(
AddedOn datetime,
ConnectionID nvarchar(100),
IP nvarchar(20),
CartItem nvarchar(100)
)
END
entrypoint.ssh:
#start SQL Server, start the script to create the DB
/opt/mssql/bin/sqlservr & /ssc/setup.sh
Docker logs <contID> shows following error.
2017-08-15 05:05:20.46 spid52 Starting up database 'snoopyshoppingcart'.
2017-08-15 05:05:20.55 spid52 Parallel redo is started for database 'snoopyshoppingcart' with worker pool size [1].
2017-08-15 05:05:20.57 spid52 Parallel redo is shutdown for database 'snoopyshoppingcart' with worker pool size [1].
Changed database context to 'snoopyshoppingcart'.
After this error, container becomes unresponsive.
It looks you placed ssc.mdf, ssc.ldf to directory /ssc in the container image. It is working fine in local machine as there is no data disk attached to it. Then when you create a POD with a persistent volume, the volume /dev/sdc is mounted to directory (mountpoint) /ssc and it does not have the file ssc.mdf or ssc.ldf.
For instance, a new mountpoint /mntdavid, when you write files into it, the file “davidtestabcd20170815” is saved on OS disk /dev/sda1.
root@davidubuntu1604lts:/mntdavid# ls -la total 815672 drwxr-xr-x 2 root root 4096 Aug 15 05:03 . drwxr-xr-x 24 root root 4096 May 18 05:11 .. -rw-r--r-- 1 root root 835232768 Aug 15 05:04 davidtestabcd20170815 root@davidubuntu1604lts:/mntdavid# df -k Filesystem 1K-blocks Used Available Use% Mounted on udev 1740984 0 1740984 0% /dev tmpfs 352176 36220 315956 11% /run /dev/sda1 30428648 8283944 22128320 28% / tmpfs 1760860 0 1760860 0% /dev/shm tmpfs 5120 0 5120 0% /run/lock tmpfs 1760860 0 1760860 0% /sys/fs/cgroup tmpfs 352176 0 352176 0% /run/user/0 tmpfs 352176 0 352176 0% /run/user/999 /dev/sdb1 7092664 16120 6693216 1% /mnt tmpfs 352176 0 352176 0% /run/user/1000
If the mountpoint is mounted with a data disk (for instance /dev/sdc1), it will only show the contents of /dev/sdc1 and you will not be able to see the files from /dev/sda1.
root@davidubuntu1604lts:/mntdavid# mount /dev/sdc1 /mntdavid root@davidubuntu1604lts:/mntdavid# ls -la total 116 drwxr-xr-x 23 root root 4096 May 17 23:58 . drwxr-xr-x 24 root root 4096 May 18 05:11 .. drwxr-xr-x 2 root root 4096 May 17 23:54 bin drwxr-xr-x 3 root root 4096 May 17 23:58 boot drwxr-xr-x 5 root root 4096 Jun 27 2016 dev drwxr-xr-x 99 root root 4096 May 18 06:55 etc drwxr-xr-x 3 root root 4096 May 17 11:05 home lrwxrwxrwx 1 root root 32 May 17 23:58 initrd.img -> boot/initrd.img-4.4.0-78-generic lrwxrwxrwx 1 root root 32 Jun 27 2016 initrd.img.old -> boot/initrd.img-4.4.0-28-generic root@davidubuntu1604lts:/mntdavid# ls -la | grep -i davidtestabcd20170815
Is there a way to set a default directory as an environmental value for SQLonLinux instead of placing any data to it?
FWIW, I have a SQL Linux container running on ACS and I used the default directory as below.
volumeMounts:
- mountPath: /var/opt/mssql
name: sqldb-home
@chzbrgr71 can you share your k8s manifests/yaml?
I have it posted here: https://github.com/chzbrgr71/sql-guestbook/blob/master/kube-db.yaml
Thanks @chzbrgr71. I see that you are using base image microsoft/mssql-server-linux to create container and then run a script to create custom database.
In my case, I have a custom database in custom database image.
@twright-msft Do you think Dockerfile for my custom database is correct?
The only thing about putting your database in the image is that it is ephemeral. Any changes you make to the data will be lost if the container is deleted. For example, when you want to upgrade the SQL Server binaries you will need to deploy a new container. That may be fine in some scenarios like for dev/test purposes if that is what you are using it for. Otherwise, you'll want to do something more like what @chzbrgr71 has done.
Thats why I am using statefulset with persistent volume.
Even if I do not put my database as part of image and try to create it using db creation script at container startup, it is failing.
I am simply trying to run db creation script at container start up - by running entrypoint.ssh -(@chzbrgr71 is running his script manually!).
Is this not valid approach?
My planned approach for this is to use a Kubernetes initContainer. This would allow for a .sql script to be run to pre-stage the database and any seed data. Working on a sample here: https://github.com/chzbrgr71/guestbook-web/blob/master/sql-initcontainer/db.yaml
Still just a draft, but I see this working well. Need another initContainer to wait for SQL Server to start as well.
If you are trying to use remote storage for this you are hitting the RC1/RC2 bug that prevents DBs from being created on NFS/Samba shares. Only local storage will work for now. This is fixed already and will be in the next release. Until then you can go back to using the :CTP1-4 tagged image which doesnt have this bug.