Thursday, January 12, 2017

End of Hardware Support for Exadata


As we know, End of Support comes 5 years after End of Manufacturing (End of Life in Oracle terminology).

Here is the table of End of Support dates for Exadata systems:


https://support.oracle.com/handbook_partner/Systems/eolSystemList.html







Wednesday, January 11, 2017

How to measure IB latency and bandwidth

Testing the IB network between 2 nodes.
The test was run on X4-2 Exadata.

I. Measuring Latency


On the 2nd node, start the listener program, then go to the 1st node:
[root@ed04dbadm02 ~]# ib_read_lat

On the 1st node run the test and see the report:

[root@ed04dbadm01 ~]# ib_read_lat -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
                    RDMA_Read Latency Test
 Number of qps   : 1
 Connection type : RC
 TX depth        : 50
 Mtu             : 2048B
 Link type       : IB
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
------------------------------------------------------------------
 local address: LID 0x0c QPN 0x0088 PSN 0x41d82e OUT 0x10 RKey 0x1f31e00 VAddr 0x000000021b5000
 remote address: LID 0x0d QPN 0x006e PSN 0x3b5a1e OUT 0x10 RKey 0x2661800 VAddr 0x00000001431000
------------------------------------------------------------------
 #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]
 2       1000          1.77           14.12        1.82  
------------------------------------------------------------------



 

Another run, this time with -a to step through the message sizes:
[root@ed04dbadm01 ~]# ib_read_lat -a -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
                    RDMA_Read Latency Test
 Number of qps   : 1
 Connection type : RC
 TX depth        : 50
 Mtu             : 2048B
 Link type       : IB
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
------------------------------------------------------------------
 local address: LID 0x0c QPN 0x008a PSN 0x266b05 OUT 0x10 RKey 0x1f5f800 VAddr 0x007f7a60362000
 remote address: LID 0x0d QPN 0x0071 PSN 0xc9915e OUT 0x10 RKey 0x268c100 VAddr 0x000000011ce000
------------------------------------------------------------------
 #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]
 2       1000          2.34           20.83        2.88  
 4       1000          2.52           37.00        2.87  
 8       1000          2.53           34.74        2.88  
 16      1000          2.33           20.61        2.78  
 32      1000          2.36           16.59        2.87  
 64      1000          2.54           17.71        2.89  
 128     1000          2.59           30.65        2.95  
 256     1000          2.56           20.28        3.16  
 512     1000          2.46           25.53        3.45  
 1024    1000          3.23           30.31        3.75  
 2048    1000          4.21           28.89        4.56  
 4096    1000          4.82           20.08        5.40  
 Completion with error at client
 Failed status 10: wr_id 1 syndrom 0x88
scnt=1, ccnt=1
[root@ed04dbadm01 ~]#

The run stops after the 4096-byte step with completion status 10, which in libibverbs is a remote access error. The most likely cause is that the listener on the 2nd node was started without -a, so its buffer was sized for its default message size. Starting the listener with the same -a option should let the sweep continue.


And see the report from 2nd node:

[root@ed04dbadm02 ~]# ib_read_lat
------------------------------------------------------------------
                    RDMA_Read Latency Test
 Number of qps   : 1
 Connection type : RC
 Mtu             : 2048B
 Link type       : IB
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
------------------------------------------------------------------
 local address: LID 0x0d QPN 0x006e PSN 0x3b5a1e OUT 0x10 RKey 0x2661800 VAddr 0x00000001431000
 remote address: LID 0x0c QPN 0x0088 PSN 0x41d82e OUT 0x10 RKey 0x1f31e00 VAddr 0x000000021b5000
-----------------------------------------------------------------
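
To compare runs quickly, the data rows of a latency report can be reduced to the smallest and largest t_typical values. The following is a minimal sketch in POSIX shell and awk, not part of perftest: the function name lat_summary is my own, and it assumes the five-column layout shown in the reports above.

```shell
# Summarize an ib_read_lat report read from stdin.
# Assumption: data rows have exactly five columns
#   #bytes  #iterations  t_min  t_max  t_typical
# Header, separator and address lines are skipped because their
# first two fields are not plain integers.
lat_summary() {
    awk 'NF == 5 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
             if (n == 0 || $5 + 0 < min + 0) { min = $5; min_size = $1 }
             if (n == 0 || $5 + 0 > max + 0) { max = $5; max_size = $1 }
             n++
         }
         END {
             if (n > 0)
                 printf "typical latency: min %s usec at %s bytes, max %s usec at %s bytes\n",
                        min, min_size, max, max_size
         }'
}

# Usage on the 1st node (not run here):
#   ib_read_lat -a -d mlx4_0 ed04dbadm02 | lat_summary
```

For the -a run above this gives a minimum of 2.78 usec at 16 bytes and a maximum of 5.40 usec at 4096 bytes.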

II. Bandwidth


 

The next stage is the bandwidth test.

On the 2nd node we run the listener:

[root@ed04dbadm02 ~]# ib_read_bw

On the 1st node we run the test itself:

[root@ed04dbadm01 ~]# ib_read_bw -a -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
                    RDMA_Read BW Test
 Number of qps   : 1
 Connection type : RC
 TX depth        : 300
 CQ Moderation   : 50
 Mtu             : 2048B
 Link type       : IB
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
------------------------------------------------------------------
 local address: LID 0x0c QPN 0x0089 PSN 0x4f763c OUT 0x10 RKey 0x1f50900 VAddr 0x007fb36137f000
 remote address: LID 0x0d QPN 0x0070 PSN 0x418495 OUT 0x10 RKey 0x267f600 VAddr 0x007fe31d2ab000
------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]
 2          1000           9.51               9.41  
 4          1000           20.29              20.21 
 8          1000           40.42              40.25 
 16         1000           80.85              78.35 
 32         1000           161.99             161.52
 64         1000           309.76             308.93
 128        1000           650.33             648.20
 256        1000           1291.16            1286.98
 512        1000           2610.95            2523.90
 1024       1000           3074.05            3070.13
 2048       1000           3192.15            3159.41
 4096       1000           3220.35            3219.55
 8192       1000           3230.01            3229.81
 16384      1000           3241.36            3241.14
 32768      1000           3245.32            3245.29
 65536      1000           3248.53            3248.47
 Completion with error at client
 Failed status 10: wr_id 1 syndrom 0x88
scnt=300, ccnt=0


[root@ed04dbadm02 ~]# ib_read_bw
------------------------------------------------------------------
                    RDMA_Read BW Test
 Number of qps   : 1
 Connection type : RC
 CQ Moderation   : 50
 Mtu             : 2048B
 Link type       : IB
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
------------------------------------------------------------------
 local address: LID 0x0d QPN 0x0070 PSN 0x418495 OUT 0x10 RKey 0x267f600 VAddr 0x007fe31d2ab000

 remote address: LID 0x0c QPN 0x0089 PSN 0x4f763c OUT 0x10 RKey 0x1f50900 VAddr 0x007fb36137f000
pp_read_keys: Success
Couldn't read remote address
 Unable to write to socket/rdam_cm
Failed to close connection between server and client
[root@ed04dbadm02 ~]#
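
To put the MB/sec figures into perspective, they can be converted to Gbit/s and compared with the link rate. This is only a sketch under two assumptions that are not stated in the report itself: that the tool counts 1 MB as 2^20 bytes, and that the usable data rate of the 40 Gb/s QDR link is 32 Gbit/s after 8b/10b encoding. Both values are parameters, and the function names are my own.

```shell
# Convert a perftest MB/sec figure to Gbit/s.
#   $1 = bandwidth in MB/sec
#   $2 = bytes per MB (assumed 1048576; pass 1000000 if your build differs)
mb_to_gbit() {
    awk -v v="$1" -v b="${2:-1048576}" \
        'BEGIN { printf "%.2f\n", v * b * 8 / 1000000000 }'
}

# Express the same figure as a percentage of the link data rate.
#   $1 = bandwidth in MB/sec
#   $2 = link data rate in Gbit/s (assumed 32 for QDR 4x)
#   $3 = bytes per MB (assumed 1048576)
link_pct() {
    awk -v v="$1" -v link="${2:-32}" -v b="${3:-1048576}" \
        'BEGIN { printf "%.1f\n", v * b * 8 / 1000000000 / link * 100 }'
}

# Usage with the 65536-byte average from the report above:
#   mb_to_gbit 3248.47
#   link_pct 3248.47
```

Under these assumptions the 65536-byte average of 3248.47 MB/sec is about 27.25 Gbit/s, or roughly 85% of the assumed data rate.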


And the bidirectional test:
[root@ed04dbadm01 ~]# ib_read_bw -a -b -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
                    RDMA_Read Bidirectional BW Test
 Number of qps   : 1
 Connection type : RC
 TX depth        : 300
 CQ Moderation   : 50
 Mtu             : 2048B
 Link type       : IB
 Outstand reads  : 16
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
------------------------------------------------------------------
 local address: LID 0x0c QPN 0x0091 PSN 0x71081f OUT 0x10 RKey 0x1f94b00 VAddr 0x007f851c57d000
 remote address: LID 0x0d QPN 0x0076 PSN 0xae8412 OUT 0x10 RKey 0x26bf600 VAddr 0x007fc7ed0af000
------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]
 2          1000           20.47              20.09 
 4          1000           40.95              40.82 
 8          1000           81.74              81.41 
 16         1000           164.40             163.81
 32         1000           328.80             327.14
 64         1000           658.83             656.19
 128        1000           1320.12            1315.04
 256        1000           2482.47            2459.42
 512        1000           5260.80            5244.75
 1024       1000           6250.11            6232.63
 2048       1000           6395.13            6386.76
 4096       1000           6466.50            6466.02
 8192       1000           6459.10            6449.20
 16384      1000           6490.17            6489.88
 32768      1000           6503.01            6502.80
 65536      1000           6499.86            6499.85
 Completion with error at client
 Failed status 10: wr_id 1 syndrom 0x88
scnt=300, ccnt=0
[root@ed04dbadm01 ~]#
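
A quick way to check how well the bidirectional result scales is to pick the same message size from both reports and take the ratio. The sketch below assumes the reports were saved to files (uni.txt and bidi.txt are hypothetical names) and that data rows have the four-column layout shown above. The helper names are my own.

```shell
# Print the "BW average" value for one message size from a
# perftest bandwidth report read from stdin.
#   $1 = message size in bytes
bw_avg_at() {
    awk -v size="$1" \
        'NF == 4 && $1 == size && $2 ~ /^[0-9]+$/ { print $4 }'
}

# Ratio of bidirectional to unidirectional bandwidth, two decimals.
#   $1 = unidirectional MB/sec, $2 = bidirectional MB/sec
bw_ratio() {
    awk -v uni="$1" -v bidi="$2" 'BEGIN { printf "%.2f\n", bidi / uni }'
}

# Usage (not run here; the file names are placeholders):
#   uni=$(bw_avg_at 65536 < uni.txt)
#   bidi=$(bw_avg_at 65536 < bidi.txt)
#   bw_ratio "$uni" "$bidi"
```

With the 65536-byte averages from the two reports in this post (3248.47 and 6499.85 MB/sec) the ratio is 2.00, so the bidirectional run doubles the throughput of the one-way run.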


III. HELP



[root@ed04dbadm01 ~]# ib_read_lat -h
Usage:
  ib_read_lat            start a server and wait for connection
  ib_read_lat <host>     connect to server at <host>

Options:
  -p, --port=<port>  Listen on/connect to port <port> (default 18515)
  -d, --ib-dev=<dev>  Use IB device <dev> (default first device found)
  -R, --rdma_cm  Connect QPs with rdma_cm and run test on those QPs
  -z, --com_rdma_cm  Communicate with rdma_cm module to exchange data - use regular QPs
  -i, --ib-port=<port>  Use port <port> of IB device (default 1)
  -c, --connection=<RC/UC/UD>  Connection type RC/UC/UD (default RC)
  -m, --mtu=<mtu>  Mtu size : 256 - 4096 (default port mtu)
  -s, --size=<size>  Size of message to exchange (default 2)
  -a, --all  Run sizes from 2 till 2^23
  -n, --iters=<iters>  Number of exchanges (at least 2, default 1000)
  -t, --tx-depth=<dep>  Size of tx queue (default 50)
  -u, --qp-timeout=<timeout>  QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14
  -S, --sl=<sl>  SL (default 0)
  -x, --gid-index=<index>  Test uses GID with GID index (Default : IB - no gid . ETH - 0)
  -F, --CPU-freq  Do not fail even if cpufreq_ondemand module is loaded
  -V, --version  Display version number
  -I, --inline_size=<size>  Max size of message to be sent in inline (default 400)
  -e, --events  Sleep on CQ events (default poll)
  -o, --outs=<num>  num of outstanding read/atom(default max of device)
  -C, --report-cycles  report times in cpu cycle units (default microseconds)
  -H, --report-histogram  Print out all results (default print summary only)
  -U, --report-unsorted  (implies -H) print out unsorted results (default sorted)



[root@ed04dbadm01 ~]# ib_read_bw -h
Usage:
  ib_read_bw            start a server and wait for connection
  ib_read_bw <host>     connect to server at <host>

Options:
  -p, --port=<port>  Listen on/connect to port <port> (default 18515)
  -d, --ib-dev=<dev>  Use IB device <dev> (default first device found)
  -R, --rdma_cm  Connect QPs with rdma_cm and run test on those QPs
  -z, --com_rdma_cm  Communicate with rdma_cm module to exchange data - use regular QPs
  -i, --ib-port=<port>  Use port <port> of IB device (default 1)
  -c, --connection=<RC/UC/UD>  Connection type RC/UC/UD (default RC)
  -m, --mtu=<mtu>  Mtu size : 256 - 4096 (default port mtu)
  -s, --size=<size>  Size of message to exchange (default 65536)
  -a, --all  Run sizes from 2 till 2^23
  -n, --iters=<iters>  Number of exchanges (at least 2, default 1000)
  -t, --tx-depth=<dep>  Size of tx queue (default 300)
  -u, --qp-timeout=<timeout>  QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14
  -S, --sl=<sl>  SL (default 0)
  -x, --gid-index=<index>  Test uses GID with GID index (Default : IB - no gid . ETH - 0)
  -F, --CPU-freq  Do not fail even if cpufreq_ondemand module is loaded
  -V, --version  Display version number
  -I, --inline_size=<size>  Max size of message to be sent in inline (default 0)
  -b, --bidirectional  Measure bidirectional bandwidth (default unidirectional)
  -Q, --cq-mod  Generate Cqe only after <--cq-mod> completion
  -e, --events  Sleep on CQ events (default poll)
  -N, --no peak-bw  Cancel peak-bw calculation (default with peak)
  -o, --outs=<num>  num of outstanding read/atom(default max of device)
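
The -u option takes an exponent rather than a time, so it helps to work the formula from the help text through once. The sketch below uses only the formula printed by the tool (4 usec * 2^timeout); the function name is my own.

```shell
# QP timeout in microseconds for a given -u value, using the formula
# from the help text: 4 usec * 2^timeout. Defaults to 14, the tool default.
qp_timeout_usec() {
    n=${1:-14}
    echo $((4 * (1 << n)))
}

# Usage:
#   qp_timeout_usec        # default -u 14
#   qp_timeout_usec 10
```

For the default value of 14 this gives 65536 usec, that is about 65.5 ms per timeout period.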

Tuesday, January 10, 2017

ORA-01555 after an unclean shutdown: how to repair Confluence

After the New Year 2017 holidays, Confluence stopped working.
The application log mentioned ORA-01555, and the database alert log showed an unclean shutdown.

However, the alert log contained no other error messages and reported no corrupted blocks.

To check data integrity I ran a full export of the database, and the export reported a bad table:

ORA-31693: Table data object "CONFLU"."BANDANA" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-01555: snapshot too old: rollback segment number  with name "" too small
ORA-22924: snapshot too old

The command

set long 9999999
select * from conflu.bandana ;
finished successfully and returned all 297 rows, as if there were no error!


But the export still raised ORA-01555.

The solution is to find the rows with damaged LOB data first:

-- Two columns: the inserts below store the rowid and the error number.
create table corrupted_lob_data (corrupted_rowid rowid, err_num number);

declare
  -- Named exceptions for the errors a damaged LOB can raise.
  error_1578  exception;  -- ORA-01578
  error_1555  exception;  -- ORA-01555
  error_22922 exception;  -- ORA-22922
  pragma exception_init(error_1578, -1578);
  pragma exception_init(error_1555, -1555);
  pragma exception_init(error_22922, -22922);
  n number;
begin
  for cursor_lob in (select rowid r, bandanavalue from conflu.bandana) loop
    begin
      -- Searching the LOB makes Oracle read through it; a damaged LOB raises an error here.
      n := dbms_lob.instr(cursor_lob.bandanavalue, hextoraw('889911'));
    exception
      when error_1578 then
        insert into corrupted_lob_data values (cursor_lob.r, 1578);
        commit;
      when error_1555 then
        insert into corrupted_lob_data values (cursor_lob.r, 1555);
        commit;
      when error_22922 then
        insert into corrupted_lob_data values (cursor_lob.r, 22922);
        commit;
    end;
  end loop;
end;
/


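This script reads the LOB in every row of the suspect table and records in corrupted_lob_data the rowid of each row whose LOB raises one of these errors.

Querying corrupted_lob_data then gave me the specific bad row:   select * from conflu.bandana where rowid='AAAUnkAAFAAABiEAAG';


The damaged value was in the "bandanavalue" field of this row.
So I overwrote this field with an empty CLOB: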

update conflu.bandana set   bandanavalue = empty_clob() where rowid='AAAUnkAAFAAABiEAAG';
commit;

Now Confluence works!

Thursday, December 8, 2016

12 TB and 14 TB hard disks

Western Digital introduced 12 TB and 14 TB hard disks.

https://www.wdc.com/about-wd/newsroom/press-room/2016-12-06-western-digital-introduces-advanced-devices-to-manage-evolving-data-center-application-demands.html

New technologies in hard disk manufacturing through 2020:
http://fcenter.ru/online/hardnews/#material_id=39308

Western Digital: We are not developing new 10K and 15K hard disks:
https://3dnews.ru/945263?from=most-commented-index

Thursday, May 19, 2016

EXAFUSION_ENABLED = 1 (default = 0)

ExaFusion is a Direct-to-Wire protocol that lets database processes send messages directly over the InfiniBand network, bypassing the OS kernel. The database process makes no system call: it writes data straight to the IB HCA and skips the normal networking software stack. Data moves directly from the user space of a process on one node to the user space of a process on another node, with no context switch to kernel space. This is one of the reasons why IB improves response time in Oracle Exadata Database Machine.

However, ExaFusion is disabled by default!
To enable ExaFusion, see the documentation:

http://docs.oracle.com/database/121/REFRN/GUID-A0612249-0A65-476D-A2B8-FA94DF5EEC7F.htm#REFRN-GUID-A0612249-0A65-476D-A2B8-FA94DF5EEC7F

You should set EXAFUSION_ENABLED=1 in the database parameter file.
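
If the database runs from an spfile, the parameter can be set with ALTER SYSTEM instead of editing a text parameter file. This is a hedged sketch and not taken from the documentation page above: it assumes an spfile is in use, that the parameter is static and so needs SCOPE = SPFILE followed by an instance restart, and that all RAC instances should get the same value. The function name is my own.

```shell
# Print the statement that sets the parameter in the spfile.
# Assumptions: an spfile is in use; the parameter is static, so the
# change takes effect only after the instances are restarted.
exafusion_sql() {
    cat <<'EOF'
ALTER SYSTEM SET exafusion_enabled = 1 SCOPE = SPFILE SID = '*';
EOF
}

# Usage on a database node as the Oracle software owner (not run here):
#   exafusion_sql | sqlplus -S / as sysdba
```

Printing the statement first makes it easy to review before it is piped to sqlplus.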

Tuesday, April 19, 2016

X6-2 1/8 new feature: half of the disks and flash cards removed

As we know, in previous generations a 1/8 Exadata had the same hardware as a 1/4 Exadata.
This means that each storage server had 12 hard disks and 4 flash cards, of which 6 hard disks and 2 flash cards were disabled but still installed in the server.

In the X6-2 we have a new feature from Oracle:  "Eighth Rack HC storage servers have half the cores enabled and half the disks and flash cards removed."

This means that an X6-2 1/8 HC storage server contains 6 hard disks and 2 flash cards.

The EF edition still has half of its flash cards disabled but installed in the storage server.



http://www.oracle.com/technetwork/database/exadata/exadata-x6-2-ds-2968790.pdf