
US20130159622A1 - Chained, scalable storage devices - Google Patents

Chained, scalable storage devices

Info

Publication number
US20130159622A1
Authority
US
United States
Prior art keywords
storage devices
host
storage
network
primary agent
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US13/765,253
Inventor
Earl T. Cohen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Seagate Technology LLC
Original Assignee
LSI Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Assigned to LSI CORPORATION. Assignment of assignors' interest (see document for details). Assignor: COHEN, EARL T.
Priority to US13/765,253 (this application)
Application filed by LSI Corp
Publication of US20130159622A1
Priority to TW103102356A (published as TWI614670B)
Priority to JP2014019112A (published as JP2014154157A)
Priority to EP14153954.4A (published as EP2765501A1)
Priority to KR1020140014704A (published as KR102171716B1)
Priority to CN201410047792.8A (published as CN103984638A)
Assigned to DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT. Patent security agreement. Assignors: AGERE SYSTEMS LLC, LSI CORPORATION
Assigned to AGERE SYSTEMS LLC and LSI CORPORATION. Termination and release of security interest in certain patents included in security interest previously recorded at reel/frame 032856/0031. Assignor: DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT
Assigned to SEAGATE TECHNOLOGY LLC. Assignment of assignors' interest (see document for details). Assignor: LSI CORPORATION
Assigned to AGERE SYSTEMS LLC and LSI CORPORATION. Termination and release of security interest in patent rights (releases RF 032856-0031). Assignor: DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT
Status: Abandoned

Classifications

    All codes are CPC classifications under G06F (Physics; Computing or calculating; Electric digital data processing):
    • G06F3/061 Improving I/O performance
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F13/14 Handling requests for interconnection or transfer
    • G06F13/4022 Coupling between buses using switching circuits, e.g. switching matrix, connection or expansion network
    • G06F3/0626 Reducing size or complexity of storage systems
    • G06F3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G06F3/0664 Virtualisation aspects at device level, e.g. emulation of a storage device or system
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F3/0683 Plurality of storage devices (in-line storage systems)
    • G06F3/0688 Non-volatile semiconductor memory arrays
    • G06F2213/0026 PCI express

Definitions

  • In some embodiments, a connection network among storage devices 110 uses a PCI-E protocol (or other standard protocol) but in nonstandard ways, such as by having a circular (loop) interconnection (e.g., as indicated by optional coupling 218(N) in FIGS. 3 and 4). In further embodiments, the connection network among storage devices 110 is enabled to use nonstandard bandwidths, signaling, commands or protocol extensions to advantageously improve performance, and to provide inter-device communication in a manner efficient in one or more of bandwidth, latency, and power.
  • The word “exemplary” is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the word exemplary is intended to present concepts in a concrete fashion.
  • Described embodiments might also be embodied in the form of methods and apparatuses for practicing those methods. Described embodiments might also be embodied in the form of program code embodied in non-transitory tangible media, such as magnetic recording media, optical recording media, solid state memory, floppy diskettes, CD-ROMs, hard drives, or any other non-transitory machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing described embodiments.
  • Described embodiments might also be embodied in the form of program code, for example, whether stored in a non-transitory machine-readable storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium or carrier, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the described embodiments. When implemented on a processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits.
  • Described embodiments might also be embodied in the form of a bitstream or other sequence of signal values electrically or optically transmitted through a medium, stored as magnetic-field variations in a magnetic recording medium, etc., generated using a method and/or an apparatus of the described embodiments.
  • As used in reference to a standard, the term “compatible” means that the element communicates with other elements in a manner wholly or partially specified by the standard, and would be recognized by other elements as sufficiently capable of communicating with the other elements in the manner specified by the standard. The compatible element does not need to operate internally in a manner specified by the standard. Unless explicitly stated otherwise, each numerical value and range should be interpreted as being approximate, as if the word “about” or “approximately” preceded the value or range.
  • The term “coupled” refers to any manner known in the art or later developed in which energy is allowed to be transferred between two or more elements, and the interposition of one or more additional elements is contemplated, although not required. Conversely, the terms “directly coupled,” “directly connected,” etc. imply the absence of such additional elements. Signals and corresponding nodes or ports might be referred to by the same name and are interchangeable for purposes here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Computer And Data Communications (AREA)
  • Hardware Redundancy (AREA)

Abstract

Described embodiments access data in a chained, scalable storage system. A primary agent of one or more storage devices receives a host request including a logical address from a host coupled to the primary agent. The primary agent determines, based on the logical address, a corresponding physical address in at least one of the storage devices and generates, based on the physical address, a sub-request for each determined physical address in the storage devices. The primary agent sends, via a storage device interface network operable independently of the host, the sub-requests to the storage devices. The storage device interface network is a peer-to-peer network coupling the storage devices to the primary agent. The primary agent receives sub-statuses in response to the sub-requests, and determines an overall status. The primary agent provides the overall status to the host such that the host is coupled to the storage devices without a switch.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation-in-part, and claims the benefit of the filing date, of U.S. patent application Ser. No. 13/702,976, filed Dec. 7, 2012, which claims the benefit of the filing dates of U.S. provisional application No. 61/497,525, filed Jun. 16, 2011, International Patent Application No. PCT/US2011/040996, filed Jun. 17, 2011, and U.S. provisional application No. 61/356,443, filed Jun. 18, 2010, the teachings of all of which are incorporated herein in their entireties by reference.
  • BACKGROUND
  • A Storage Area Network (SAN) is a system that provides access to consolidated, block-level storage, such as disk arrays and tape libraries, to one or more host devices coupled to the SAN. A SAN represents a plurality of storage devices as a single logical interface to the host devices, conceptually aggregating the storage implemented by each of the storage devices into a single logical storage space. A typical SAN might be scalable, meaning that the amount of storage space (e.g., the number of storage devices) can be changed as needed in different SAN systems. As noted, a SAN provides block-level access, meaning that the file system is typically managed by the host devices. A typical SAN might employ block-level protocols such as Fibre Channel (FC), Advanced Technology Attachment (ATA) over Ethernet (AoE), Internet Small Computer System Interface (iSCSI) or HyperSCSI. A SAN directly transfers data between storage devices and host devices.
  • A Network Attached Storage (NAS) is a system that provides file-level access to one or more host devices coupled to the NAS. Unlike a SAN, the NAS system provides a file system for its attached storage devices, essentially acting as a file server accessing one or more local block-level storage devices. A typical NAS might employ file-level protocols such as Network File System (NFS) or Server Message Block/Common Internet File System (SMB/CIFS). A SAN-NAS hybrid system is a system that provides hosts with both file-level access like a NAS device and block-level access like a SAN system from the same storage system.
  • In SAN, NAS and SAN-NAS hybrid systems, it is desirable to employ multiple storage devices so that total system storage can be increased by grouping together a plurality of storage devices. Such grouping typically requires a communication hierarchy with a switch so that the storage devices are available to the host, either individually or in aggregate.
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
  • Described embodiments access data in a chained, scalable storage system. A primary agent of one or more storage devices receives a host request including a logical address from a host coupled to the primary agent. The primary agent determines, based on the logical address, a corresponding physical address in at least one of the storage devices and generates, based on the physical address, a sub-request for each determined physical address in the storage devices. The primary agent sends, via a storage device interface network operable independently of the host, the sub-requests to the storage devices. The storage device interface network is a peer-to-peer network coupling the storage devices to the primary agent. The primary agent receives sub-statuses in response to the sub-requests, and determines an overall status. The primary agent provides the overall status to the host such that the host is coupled to the storage devices without a switch.
  • BRIEF DESCRIPTION OF THE DRAWING FIGURES
  • Other aspects, features, and advantages of described embodiments will become more fully apparent from the following detailed description, the appended claims, and the accompanying drawings in which like reference numerals identify similar or identical elements.
  • FIG. 1 shows a block diagram of a scalable storage system in accordance with exemplary embodiments;
  • FIG. 2 shows a block diagram of a scalable storage system in accordance with exemplary embodiments;
  • FIG. 3 shows a block diagram of a scalable storage system in accordance with exemplary embodiments;
  • FIG. 4 shows a block diagram of a scalable storage system in accordance with exemplary embodiments; and
  • FIG. 5 shows a block diagram of a scalable storage system in accordance with exemplary embodiments.
  • DETAILED DESCRIPTION
  • Described embodiments access data in a chained, scalable storage system. A primary agent of one or more storage devices receives a host request including a logical address from a host coupled to the primary agent. The primary agent determines, based on the logical address, a corresponding physical address in at least one of the storage devices and generates, based on the physical address, a sub-request for each determined physical address in the storage devices. The primary agent sends, via a storage device interface network operable independently of the host, the sub-requests to the storage devices. The storage device interface network is a peer-to-peer network coupling the storage devices to the primary agent. The primary agent receives sub-statuses in response to the sub-requests, and determines an overall status. The primary agent provides the overall status to the host such that the host is coupled to the storage devices without a switch.
  • Table 1 defines a list of acronyms employed throughout this specification as an aid to understanding the described embodiments:
  • TABLE 1
    AoE    Advanced Technology Attachment (ATA) over Ethernet
    CD     Compact Disc
    CIFS   Common Internet File System
    DVD    Digital Versatile Disc
    FC     Fibre Channel
    HDD    Hard Disk Drive
    HIF    Host InterFace
    IC     Integrated Circuit
    I/O    Input/Output
    iSCSI  Internet SCSI
    MRAM   Magnetoresistive Random Access Memory
    NAS    Network Attached Storage
    NFS    Network File System
    PCI-E  Peripheral Component Interconnect Express
    PHY    PHysical Layer
    RAID   Redundant Array of Independent Disks
    RF     Radio Frequency
    SAN    Storage Area Network
    SAS    Serial Attached SCSI
    SATA   Serial Advanced Technology Attachment
    SCSI   Small Computer System Interface
    SMB    Server Message Block
    SoC    System on Chip
    SRIO   Serial Rapid Input/Output
    SSD    Solid-State Disk
    USB    Universal Serial Bus
  • In some SAN, NAS or SAN-NAS hybrid systems, a primary agent of the storage devices might accept storage requests received from host devices over a host-interface (HIF) protocol. The primary agent processes the host requests and generates one or more sub-requests to secondary agents of each storage device over a peer-to-peer protocol. The secondary agents accept and process the sub-requests, and report sub-status information for each of the sub-requests to the primary agent and/or the host. The primary agent optionally accumulates the sub-statuses into an overall status of the host request. Peer-to-peer communication between the agents is optionally used to communicate redundancy information during host accesses and/or failure recoveries. Various failure recovery techniques might reallocate storage, reassign agents and recover data via redundancy information.
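The flow just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the class names, the round-robin striping policy, and the ok/error status model are invented for concreteness, and the peer-to-peer send is reduced to a direct method call.

```python
from dataclasses import dataclass

@dataclass
class SubRequest:
    device_id: int      # secondary agent that services this piece
    physical_addr: int  # physical address on that device
    op: str             # "read" or "write"

class SecondaryAgent:
    def __init__(self, device_id):
        self.device_id = device_id
        self.blocks = {}  # physical_addr -> data

    def process(self, sub, data=None):
        """Service one sub-request and report a sub-status."""
        if sub.op == "write":
            self.blocks[sub.physical_addr] = data
            return "ok"
        return "ok" if sub.physical_addr in self.blocks else "error"

class PrimaryAgent:
    def __init__(self, secondaries):
        self.secondaries = secondaries  # peers reached over the device network

    def map_logical(self, logical_addr):
        """Resolve a host logical address to (device, physical address).
        Round-robin striping is assumed purely for illustration."""
        n = len(self.secondaries)
        return logical_addr % n, logical_addr // n

    def handle_host_request(self, op, logical_addrs, data=None):
        # Generate one sub-request per resolved physical address ...
        subs = [SubRequest(dev, pa, op)
                for dev, pa in map(self.map_logical, logical_addrs)]
        # ... send each over the peer-to-peer network (a direct call here) ...
        sub_statuses = [self.secondaries[s.device_id].process(s, data)
                        for s in subs]
        # ... and accumulate the sub-statuses into one overall status.
        return "ok" if all(st == "ok" for st in sub_statuses) else "error"

agents = PrimaryAgent([SecondaryAgent(i) for i in range(4)])
assert agents.handle_host_request("write", [0, 1, 2], data=b"x") == "ok"
assert agents.handle_host_request("read", [0, 1, 2]) == "ok"
```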
  • FIG. 1 shows a block diagram of an exemplary scalable storage system, for example as described in related U.S. patent application Ser. No. 13/702,976, filed Dec. 7, 2012, which is incorporated herein by reference. As shown in FIG. 1, a scalable storage system includes at least one host device (100) coupled to pluggable storage module 190 via coupling 101. Coupling 101 might be implemented as a transmission medium, such as a backplane, copper cables, optical fibers, one or more coaxial cables, one or more twisted pair copper wires, and/or one or more radio frequency (RF) channels. For example, coupling 101 might be implemented as an FC, AoE, iSCSI, or HyperSCSI link (e.g., in a SAN system) or as an NFS or SMB/CIFS link (e.g., in a NAS system).
  • Pluggable storage module 190 includes at least one host/storage device interface (shown as 180). Although shown in FIG. 1 as being integrated with pluggable storage module 190, in some embodiments, host/storage device interface 180 might be integrated with each host device 100. In some embodiments, pluggable storage module 190 might be implemented as an add-in card. As shown in FIG. 1, pluggable storage module 190 includes host-visible storage 110, which includes one or more storage devices 110(1)-110(N). Host-visible storage 110 implements storage, part or all of which is configured to allow access by host devices 100 via host/storage device interface 180. Pluggable storage module 190 also includes host-invisible storage 120, which includes one or more storage devices 120(1)-120(M). Host-invisible storage 120 implements storage that is not directly reported to host devices 100 and is thus “invisible” to them. However, that storage is reported by, and is indirectly accessible to host devices 100 through, elements of host-visible storage 110, for example via a peer-to-peer protocol. For example, a primary agent of the storage elements reports the combined storage capacity of the primary agent and any secondary agents in communication with the primary agent, even though the secondary agents are not visible to host device 100. In some embodiments, storage devices 110 and 120 are physical storage devices, such as Solid State Disks (SSDs), Hard Disk Drives (HDDs), tape libraries, hybrid magnetic and solid state storage systems, or some combination thereof.
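As a toy illustration of that capacity-reporting behavior (the helper and the gigabyte figures below are hypothetical), the primary agent advertises a single aggregate size to the host:

```python
# Hypothetical sketch: a primary agent reports the combined capacity of
# itself and its host-invisible secondary agents as one logical device.
def reported_capacity_gb(primary_gb, secondary_gbs):
    return primary_gb + sum(secondary_gbs)

# One host-visible device backed by three invisible 960 GB devices:
assert reported_capacity_gb(960, [960, 960, 960]) == 3840  # host sees 3840 GB
```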
  • Together, combinations of couplings 101, 111 and 121 enable request, status, and data transfers between host devices 100 and host-visible storage 110 (and host-invisible storage 120 via host-visible storage 110). For example, one or more of the couplings enable transfers via a host-interface protocol, for example by one of host devices 100 operating as a master and one of the storage elements of host-visible storage 110 operating as a slave. Further, one or more of the couplings enable transfers via a peer-to-peer protocol, for example by one of the elements of host-visible storage 110 operating as a primary agent and one of the elements of host-invisible storage 120 or another one of the elements of host-visible storage 110 operating as a secondary agent. Couplings 111 and 121 might be implemented as custom-designed communication links, or might be implemented as links conforming to a standard communication protocol such as, for example, a Small Computer System Interface (SCSI) link, a Serial Attached SCSI (SAS) link, a Serial Advanced Technology Attachment (SATA) link, a Universal Serial Bus (USB), a Fibre Channel (FC) link, an Ethernet link (e.g., a 10GE link), an IEEE 802.11 link, an IEEE 802.15 link, an IEEE 802.16 link, a Peripheral Component Interconnect Express (PCI-E) link, a Serial Rapid I/O (SRIO) link, an InfiniBand link, or other similar interface link.
  • In some embodiments, host/storage device interface 180 might typically be implemented as one or more PCI-E or InfiniBand switches such that host device 100, coupling 101 and host/storage device interface 180 implement a unified switch. In further embodiments, the unified switch is operable as a transparent switch with respect to host-visible storage 110 and also simultaneously operable as a non-transparent switch with respect to host-invisible storage 120. As shown in FIG. 1, the PCI-E switch (e.g., host/storage device interface 180) is a separate element distinct from each of storage devices 110 and 120.
  • Thus, related U.S. patent application Ser. No. 13/702,976, filed Dec. 7, 2012, incorporated herein by reference, describes a scalable storage system including one or more PCI-E or InfiniBand switches (e.g., host/storage device interface 180). If the PCI-E switch is a non-transparent switch, details of the topology below the switch and specifics of the configuration of individual storage devices are hidden from the host device (e.g., on host initialization discovery of attached devices). Thus, employing the non-transparent switch, described embodiments could select one of the storage devices to act as a master device (e.g., a primary agent) to handle all host communication with all the storage devices, and select the rest of the storage devices to act as slave devices (e.g., as secondary agents) that are hidden from the host device, even though all of storage devices 110 and 120 might be duplicate devices. Further, the aggregate group of storage devices might appear as a single storage device to the host device.
  • Other described embodiments can provide scalable functionality without employing a separate PCI-E switch by employing “neighbor-to-neighbor” communication such that communications employ point-to-point links between each of the storage devices without a need for a higher level (e.g., a PCI-E hierarchy). By techniques such as routing or switching, all of the storage devices are able to communicate among each other even though all the connections are point-to-point between the storage devices.
  • FIG. 2 shows a block diagram of an exemplary storage device 110. Host device 100 is coupled to storage device 110 via coupling 101. Coupling 101 is in communication with PHY interface 202. As shown in FIG. 2, PHY interface 202 includes one or more upstream physical layer links or ports (PHYs) (shown as 101) and one or more downstream PHYs (shown as 218(1)-218(N)). As shown in FIG. 2, storage device 110 includes a mass storage device 216 that includes one or more of solid-state storage 210 (e.g., an SSD), magnetic storage 212 (e.g., an HDD or tape library) and optical storage 214 (e.g., a CD or DVD). Storage device 110 includes storage interface 206, which communicates with each individual storage element 210, 212 and 214. Logical/Physical translation module 204 translates between logical addresses for operations received from host device 100 and physical addresses on mass storage 216. Storage device 110 also includes sub-status module 222 and sub-request module 220, both of which are in communication with PHY interface 202.
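The logical/physical translation module can be pictured as a simple mapping table, as in the sketch below; the table-based scheme and the naive allocator are assumptions for illustration, since the text does not specify a particular translation mechanism.

```python
# Hypothetical sketch of logical/physical translation module 204 as a
# mapping table from host logical blocks to physical blocks on mass storage.
class LogicalPhysicalTranslation:
    def __init__(self):
        self.table = {}     # logical block -> physical block
        self.next_free = 0  # naive sequential allocator, for illustration

    def translate(self, logical_block, allocate=False):
        if logical_block not in self.table and allocate:
            self.table[logical_block] = self.next_free
            self.next_free += 1
        return self.table.get(logical_block)

l2p = LogicalPhysicalTranslation()
l2p.translate(7, allocate=True)  # a write to logical block 7 allocates space
assert l2p.translate(7) == 0     # later reads resolve to physical block 0
```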
  • In described embodiments, the upstream PHYs (e.g., 101) are in communication with a host device (e.g., 100) via the PCI-E hierarchy, and downstream PHYs (e.g., 218) are in communication with other storage devices (e.g., multiple of 110). Exemplary embodiments might employ a fixed number of configurable PHYs, for example, 8 total configurable PHYs, where a given PHY might be configured as an upstream link or a downstream link. Having configurable PHYs allows for a trade-off between bandwidth delivered to host device 100 (e.g., upstream connectivity) and capacity of the scalable storage system (e.g., downstream connectivity). Other embodiments might employ a fixed number of upstream PHYs and a fixed number of downstream PHYs, for example, 2 upstream PHYs and 6 downstream PHYs.
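That trade-off can be expressed as a small configuration check; the helper below is hypothetical and simply enforces that the fixed pool of 8 PHYs is split between the two directions.

```python
# Hypothetical check for a device with 8 configurable PHYs: each PHY is
# either upstream (host bandwidth) or downstream (chained capacity).
TOTAL_PHYS = 8

def configure_phys(upstream):
    assert 1 <= upstream < TOTAL_PHYS, "need at least one link each way"
    return upstream, TOTAL_PHYS - upstream  # (upstream, downstream)

assert configure_phys(2) == (2, 6)  # matches the fixed 2-up/6-down example
assert configure_phys(4) == (4, 4)  # more host bandwidth, less fan-out
```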
  • In various embodiments, some or all of PHYs 101 and 218 of a storage device (e.g., 110) might be operable at the same speed (e.g., a same maximum speed) or might each be operable at different speeds. For example, some embodiments might allow each of PHYs 101 and 218 to independently support any one or more of: PCI-E Gen1, Gen2, Gen3 or Gen4, 10GE, InfiniBand, SAS, SATA, or a nonstandard protocol for communication with one or more storage devices. Each of PHYs 101 and 218 is coupled to one or more respective PHY interfaces integrated within each storage device 110. When, for example, PHY interface 202 is a PCI-E interface, the PCI-E interface is configurable to communicate as one or more of: a root complex; a forwarding point; and an endpoint. A forwarding point is similar to a root complex in that a forwarding point can send and receive traffic among one or more PCI-E interfaces. A root complex is additionally a root of a separate PCI-E hierarchy. Since a host device (e.g., 100) coupled to one or more storage devices (e.g., 110) is itself a root complex, if one or more of the storage devices coupled to the host is also a root complex, then a multi-root PCI-E hierarchy is created.
  • Multiple storage devices 110 might be connected in any number of different ways. FIGS. 3-5 show block diagrams of exemplary point-to-point connections of multiple storage devices in scalable storage systems in accordance with exemplary embodiments. As shown, in various embodiments the PHYs and the PHY controllers might be coupled via: a daisy chain (or optionally a loop) as shown in FIG. 3; a fixed, 1-to-1 interconnection to a host device (shown in FIG. 4); a full crossbar topology; a partial crossbar topology; a multiplexor network; a combination thereof; or any other technique for coupling multiple hardware devices. In some embodiments, the connection network among the storage devices is a switched network, while in others, the connection network among the storage devices is a routed network. Further, in some embodiments, at least some of storage devices 110 have a different configuration of PHYs, or one or more different types of PHYs (e.g., PCI-E, 10GE, InfiniBand, SAS, SATA, etc.).
  • As shown in FIGS. 3 and 4, storage devices 110(A)-110(N) of FIG. 3, and storage devices 110(1)-110(N) of FIG. 4 have internal PHY interfaces configured as forwarding points. FIG. 5 shows a hierarchical coupling where all of storage devices 110 have PHY interfaces that are configured as forwarding points, except storage devices 110.Z1 through 110.ZN, which have PHY interfaces configured as endpoints. Thus, in described embodiments, one or more of storage devices 110 (e.g., storage device 110(A) of FIG. 3, 110(1)-110(N) of FIG. 4, 110.A of FIG. 5) is coupled to host device 100, and all of the storage devices are coupled directly to host device 100 (e.g., as shown in FIG. 4), or are coupled indirectly to host device 100 via others of the storage devices, without employing, for example, a PCI-E switch.
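A sketch of forwarding-point behavior on a FIG. 3-style daisy chain follows. The Device class, the role names, and the hop-by-hop delivery are invented for illustration and stand in for the PCI-E-level mechanics.

```python
# Hypothetical hop-by-hop forwarding along a daisy chain: every device's
# PHY interface is a forwarding point except the last, an endpoint.
class Device:
    def __init__(self, name, role):
        self.name, self.role = name, role  # role: "forwarding" or "endpoint"
        self.downstream = None             # next device in the chain

    def deliver(self, target, payload, hops=0):
        if self.name == target:
            return f"{target} got {payload!r} after {hops} hops"
        if self.role == "endpoint" or self.downstream is None:
            return f"undeliverable at {self.name}"
        # A forwarding point relays traffic that it does not terminate.
        return self.downstream.deliver(target, payload, hops + 1)

a = Device("A", "forwarding")  # directly attached to the host
b = Device("B", "forwarding")
c = Device("C", "endpoint")    # end of the chain
a.downstream, b.downstream = b, c
assert a.deliver("C", "sub-request") == "C got 'sub-request' after 2 hops"
```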
  • At least one of storage devices 110 acts as a primary agent, and one or more of storage devices 110 act as secondary agents. In various embodiments, the one or more primary agents have a direct, more direct, shorter, and/or lower latency connection with host device 100 than the secondary agents. For example, as shown in FIG. 3, storage device 110(A) might act as the primary agent for storage devices 110(B)-110(N), since, for example, storage device 110(A) has a direct connection to host device 100, while storage devices 110(B)-110(N) are coupled to one another in a daisy chain. As shown in FIG. 4, all of storage devices 110(1)-110(N) are able to act as primary agents for themselves, as each storage device 110(1)-110(N) has a direct connection to host device 100. Each storage device having a direct connection to the host advantageously enables bandwidth to/from the host to scale linearly with the number of storage devices. Further, having a subset of the storage devices, such as just one of the storage devices, act as a primary agent and the others as secondary agents enables scalable capacity without a need for the host to control a plurality of separate storage devices. As shown in FIG. 5, storage device 110.A might act as the primary agent for storage devices 110.B1-110.Bn, since, for example, storage device 110.A has a direct connection to host device 100, while storage device 110.B1 might act as a primary agent for storage devices (not shown) coupled via couplings 218(C1), and so on.
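One plausible reading of that selection criterion is “fewest hops to the host”, as in this hypothetical helper:

```python
# Hypothetical primary-agent selection: prefer the device with the most
# direct connection to the host (0 hops = directly attached).
def choose_primary(hops_to_host):
    return min(hops_to_host, key=hops_to_host.get)

# FIG. 3-style chain: A is attached directly; B and C hang off A.
assert choose_primary({"A": 0, "B": 1, "C": 2}) == "A"
```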
  • In described embodiments, all communication between primary agents and secondary agents is performed as neighbor-to-neighbor traffic that is not visible to host device 100 (and, thus, not visible to the PCI-E hierarchy of host device 100). For example, as shown in FIG. 3, all of the neighbor-to-neighbor traffic is performed on couplings 218(1)-218(N), and none of the neighbor-to-neighbor traffic is performed on coupling 101, which couples storage devices 110 to host device 100. Similarly, as shown in FIG. 4, all of the neighbor-to-neighbor traffic is performed on couplings 218(1)-218(N), and none of the neighbor-to-neighbor traffic is performed on couplings 101(1)-101(N), which couple storage devices 110(1)-110(N) to host device 100. Similarly, as shown in FIG. 5, all of the neighbor-to-neighbor traffic is performed on couplings 218(B1)-218(Zn), and none of the neighbor-to-neighbor traffic is performed on coupling 101, which couples storage device 110.A to host device 100.
  • In described embodiments, the neighbor-to-neighbor traffic is control traffic, such as: commands received by a primary agent from host device 100 and forwarded to a specific one of storage devices 110, and responses (e.g., completions) returned from the specific one of storage devices 110 to the primary agent; information derived from commands received from host device 100; maintenance traffic, such as synchronization or heartbeats; RAID or other data redundancy control or data traffic (e.g., deltas for RAID); and other traffic. For example, when a write command updates a part of a RAID stripe on a particular one of storage devices 110, the particular storage device sends a RAID delta to one or more of the other storage devices (e.g., the one of the storage devices storing the RAID parity of the stripe) as neighbor-to-neighbor traffic.
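For the RAID example, a common realization of the "delta" is the XOR of the old and new contents of the written portion of the stripe; the sketch below assumes this XOR-based (RAID-5 style) scheme, which the description does not mandate:

```python
def raid_delta(old_data: bytes, new_data: bytes) -> bytes:
    """Delta for a partial-stripe write: XOR of old and new contents."""
    assert len(old_data) == len(new_data)
    return bytes(o ^ n for o, n in zip(old_data, new_data))

def fold_delta_into_parity(parity: bytes, delta: bytes) -> bytes:
    """The parity-holding device XORs the received delta into its stored parity."""
    return bytes(p ^ d for p, d in zip(parity, delta))
```

The delta travels only over the inter-device couplings (e.g., 218), so the parity update never appears on coupling 101 and remains invisible to host device 100.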
  • Couplings 101 and 218, as shown in FIGS. 3-5, are optionally or selectively of different bandwidths and/or different protocols. For example, upstream connections (e.g., coupling 101) to host device 100 might typically be PCI-E Gen4, while downstream connections (e.g., couplings 218) among the various storage devices 110 might typically be PCI-E Gen3 or a different protocol, such as 10GE, InfiniBand, SAS, etc. Any of the couplings might have a different bandwidth or a different number of physical links from each other. In some embodiments, control traffic of any of couplings 101 and 218 might be transferred over relatively lower-bandwidth sideband couplings, while data traffic might be transferred over relatively higher-bandwidth main band couplings. Thus, in some embodiments, any of couplings 101 and 218 might be implemented as custom-designed communication links, or might be implemented as links conforming to a standard communication protocol such as, for example, SCSI, SAS, SATA, USB, FC, Ethernet (e.g., 10GE), IEEE 802.11, IEEE 802.15, IEEE 802.16, PCI-E, SRIO, InfiniBand, or other similar interface link.
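To make the bandwidth asymmetry concrete, the following sketch assumes PCI-E per-lane transfer rates of 16 GT/s for Gen4 and 8 GT/s for Gen3; the lane counts and the sideband rate are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Coupling:
    protocol: str
    lanes: int           # number of physical links
    gts_per_lane: float  # transfer rate per lane, in GT/s

    @property
    def raw_rate(self) -> float:
        return self.lanes * self.gts_per_lane

upstream = Coupling("PCI-E Gen4", lanes=4, gts_per_lane=16.0)   # e.g., coupling 101
downstream = Coupling("PCI-E Gen3", lanes=2, gts_per_lane=8.0)  # e.g., couplings 218
sideband = Coupling("control link", lanes=1, gts_per_lane=1.0)  # control traffic only
```

Control traffic would then be steered onto the low-rate sideband coupling, and data traffic onto the high-rate main band coupling.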
  • In some embodiments, such as shown in FIG. 4, a bandwidth upstream to host device 100 is substantially equal to an aggregate deliverable bandwidth of the various storage devices 110. In some embodiments, such as shown in FIG. 5, storage devices 110 that are communicatively closer to host device 100 (e.g., storage device 110.A) are configured for a higher bandwidth than storage devices communicatively farther from host device 100 (e.g., storage device 110.Z1). In some embodiments, each of storage devices 110 might have different capacities or capabilities, or might be implemented with different types of storage media, such as Solid State Disks (SSDs), Hard Disk Drives (HDDs), Magnetoresistive Random Access Memory (MRAM), tape libraries, hybrid magnetic and solid state storage systems, or some combination thereof.
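The FIG. 4 style configuration can be stated as a simple invariant; a hedged sketch, with all bandwidths in the same units and the helper name hypothetical:

```python
def upstream_matches_aggregate(host_link_bws, device_bws, tolerance=0.1):
    """True when total host-facing bandwidth is within `tolerance` of the
    aggregate deliverable bandwidth of the storage devices."""
    upstream = sum(host_link_bws)
    aggregate = sum(device_bws)
    return abs(upstream - aggregate) <= tolerance * aggregate
```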
  • In some embodiments, a connection network among storage devices 110 uses a PCI-E protocol (or other standard protocol) but in nonstandard ways, such as by having a circular (loop) interconnection (e.g., as indicated by optional coupling 218(N) in FIGS. 3 and 4). In further embodiments, the connection network among storage devices 110 is enabled to use nonstandard bandwidths, signaling, commands or protocol extensions to advantageously improve performance. In general, the connection network among the storage devices 110 is enabled to provide inter-device communication in a manner efficient in one or more of bandwidth, latency, and power.
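For the circular (loop) interconnection, a routing layer might simply pick the shorter direction around the ring; a minimal sketch, assuming the devices are indexed 0..n-1 around the loop:

```python
def ring_route(src: int, dst: int, n: int) -> list:
    """Hop-by-hop path from src to dst, taking the shorter way around a ring of n devices."""
    clockwise = (dst - src) % n
    counterclockwise = (src - dst) % n
    step = 1 if clockwise <= counterclockwise else -1
    path, here = [], src
    while here != dst:
        here = (here + step) % n
        path.append(here)
    return path
```

For example, in an eight-device loop, ring_route(0, 6, 8) returns [7, 6], taking the two-hop counterclockwise path instead of the six-hop clockwise one.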
  • Thus, described embodiments provide access to data in a chained, scalable storage system. A primary agent of one or more storage devices receives a host request including a logical address from a host coupled to the primary agent. The primary agent determines, based on the logical address, a corresponding physical address in at least one of the storage devices and generates, based on the physical address, a sub-request for each determined physical address in the storage devices. The primary agent sends, via a storage device interface network operable independently of the host, the sub-requests to the storage devices. The storage device interface network is a peer-to-peer network coupling the storage devices to the primary agent. The primary agent receives sub-statuses in response to the sub-requests, determines an overall status, and provides the overall status to the host. In this manner, the host is coupled to the storage devices without employing a network switch.
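This end-to-end flow can be summarized in a short sketch; the l2p_map (logical-to-physical mapping) and the device execute() interface are hypothetical stand-ins for whatever the primary agent actually implements:

```python
def handle_host_request(host_request, l2p_map, devices):
    """Fan a host request out as sub-requests and reduce the sub-statuses."""
    # 1. Translate the host-supplied logical address into one or more
    #    (device_id, physical_address) targets.
    targets = l2p_map[host_request["logical_address"]]

    # 2. Send one sub-request per determined physical address over the
    #    peer-to-peer storage device interface network.
    sub_statuses = []
    for device_id, physical_address in targets:
        sub_request = {"op": host_request["op"],
                       "physical_address": physical_address}
        sub_statuses.append(devices[device_id].execute(sub_request))

    # 3. Determine the single overall status returned to the host.
    return "OK" if all(s == "OK" for s in sub_statuses) else "ERROR"
```

Only the one overall status crosses the host interface network; the fan-out and the sub-statuses stay on the storage device interface network.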
  • Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments necessarily mutually exclusive of other embodiments. The same applies to the term “implementation.”
  • As used in this application, the word “exemplary” means serving as an example, instance, or illustration. Any aspect or design described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the word exemplary is intended to present concepts in a concrete fashion.
  • While the exemplary embodiments have been described with respect to processing blocks in a software program, including possible implementation as a digital signal processor, micro-controller, or general-purpose computer, described embodiments are not so limited. As would be apparent to one skilled in the art, various functions of software might also be implemented as processes of circuits. Such circuits might be employed in, for example, a single integrated circuit, a multi-chip module, a single card, or a multi-card circuit pack.
  • Described embodiments might also be embodied in the form of methods and apparatuses for practicing those methods. Described embodiments might also be embodied in the form of program code embodied in non-transitory tangible media, such as magnetic recording media, optical recording media, solid state memory, floppy diskettes, CD-ROMs, hard drives, or any other non-transitory machine-readable storage medium, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing described embodiments. Described embodiments might also be embodied in the form of program code, for example, whether stored in a non-transitory machine-readable storage medium, loaded into and/or executed by a machine, or transmitted over some transmission medium or carrier, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the described embodiments. When implemented on a general-purpose processor, the program code segments combine with the processor to provide a unique device that operates analogously to specific logic circuits. Described embodiments might also be embodied in the form of a bitstream or other sequence of signal values electrically or optically transmitted through a medium, stored as magnetic-field variations in a magnetic recording medium, etc., generated using a method and/or an apparatus of the described embodiments.
  • It should be understood that the steps of the exemplary methods set forth herein are not necessarily required to be performed in the order described, and the order of the steps of such methods should be understood to be merely exemplary. Likewise, additional steps might be included in such methods, and certain steps might be omitted or combined, in methods consistent with various described embodiments.
  • As used herein in reference to an element and a standard, the term “compatible” means that the element communicates with other elements in a manner wholly or partially specified by the standard, and would be recognized by other elements as sufficiently capable of communicating with the other elements in the manner specified by the standard. The compatible element does not need to operate internally in a manner specified by the standard. Unless explicitly stated otherwise, each numerical value and range should be interpreted as being approximate, as if the word “about” or “approximately” preceded the value or range.
  • Also for purposes of this description, the terms “couple,” “coupling,” “coupled,” “connect,” “connecting,” or “connected” refer to any manner known in the art or later developed in which energy is allowed to be transferred between two or more elements, and the interposition of one or more additional elements is contemplated, although not required. Conversely, the terms “directly coupled,” “directly connected,” etc., imply the absence of such additional elements. Signals and corresponding nodes or ports might be referred to by the same name and are interchangeable for purposes here.
  • It will be further understood that various changes in the details, materials, and arrangements of the parts that have been described and illustrated in order to explain the nature of the described embodiments might be made by those skilled in the art without departing from the scope expressed in the following claims.

Claims (20)

We claim:
1. A method of accessing data in a chained, scalable storage system, the method comprising:
receiving, by a primary agent of one or more storage devices, a host request from a host device coupled to the primary agent via a host interface network, the request to access a logical address of the one or more storage devices;
determining, by the primary agent based on the logical address, a corresponding physical address in at least one of the one or more storage devices;
generating, by the primary agent based on the physical address, a sub-request corresponding to the host request and each of the determined corresponding physical addresses in at least one of the one or more storage devices;
sending, by the primary agent via a storage device interface network operable independently of the host device, the sub-requests to the at least one storage device, the storage device interface network being a peer-to-peer network coupling the storage devices to the primary agent; and
receiving, by the primary agent from the at least one storage device, respective sub-statuses in response to the sub-requests, determining an overall status based on each respective sub-status, and providing the overall status to the host device,
wherein the host device is coupled to the one or more storage devices without employing a network switch.
2. The method of claim 1, wherein the storage device interface network is not directly accessible to the host interface network.
3. The method of claim 2, further comprising:
sending, by each of the storage devices, data communication via a respective separate data communication path with the host separate from the storage device interface network,
whereby control traffic between the host device and the storage devices is solely between the host device and the primary agent, while data communication bandwidth scales with a number of the storage devices.
4. The method of claim 1, wherein, for the method, the host interface network and the storage device interface network comprise transmission media comprising at least one of: a backplane, one or more copper cables, one or more optical fibers, one or more coaxial cables, and one or more twisted pair copper wires.
5. The method of claim 4, further comprising:
selectively providing higher bandwidth storage device interface network connections to a subset of the one or more storage devices.
6. The method of claim 5, wherein the subset of the one or more storage devices comprises one or more of the storage devices located proximately to the host device.
7. The method of claim 4, wherein the host interface network comprises a Peripheral Component Interconnect Express (PCI-E) network.
8. The method of claim 7, wherein the host interface network comprises a PCI-E Gen4 network, and the storage device interface network comprises one or more of: a PCI-E Gen3 network, an Ethernet network, a Serial Attached Small Computer System Interface (SAS) network, and a Serial Advanced Technology Attachment (SATA) network.
9. The method of claim 1, wherein, for the method, the one or more storage devices comprise at least one of: a Solid State Disk (SSD), a Hard Disk Drive (HDD), a Magnetoresistive Random Access Memory (MRAM), a tape library, and a hybrid magnetic and solid state storage system.
10. The method of claim 1, further comprising:
providing a bandwidth to the host interface network that is related to an aggregate deliverable bandwidth of the one or more storage devices.
11. The method of claim 10, wherein the storage device interface network comprises one or more physical links, each link having an independent bandwidth.
12. The method of claim 11, wherein each of the one or more physical links comprises (i) a relatively lower-bandwidth sideband coupling for transferring control data, and (ii) a relatively higher-bandwidth main band coupling for transferring user data.
13. The method of claim 10, wherein the providing comprises providing each of the storage devices with a separate physical link of the host interface network.
14. The method of claim 1, further comprising:
employing the one or more storage devices in a Redundant Array of Independent Disks (RAID) system.
15. A chained, scalable storage system comprising:
a plurality of storage devices, at least one of the storage devices being a primary agent for one or more of the plurality of storage devices;
a host device coupled via a host interface network to the at least one primary agent,
wherein the at least one primary agent is configured to:
receive a host request from the host device, the request to access a logical address of the one or more of the plurality of storage devices;
determine, based on the logical address, a corresponding physical address in at least one of the one or more of the plurality of storage devices;
generate, based on the physical address, a sub-request corresponding to the host request and each of the determined corresponding physical addresses in at least one of the one or more of the plurality of storage devices;
send, via a storage device interface network operable independently of the host device, the sub-requests to the at least one storage device, the storage device interface network being a peer-to-peer network coupling the storage devices to the primary agent; and
receive, from the at least one storage device, respective sub-statuses in response to the sub-requests, determine an overall status based on each respective sub-status, and provide the overall status to the host device,
wherein the host device is coupled to the one or more storage devices without employing a network switch.
16. The system of claim 15, wherein the storage device interface network is not directly accessible to the host interface network.
17. The system of claim 16, wherein control traffic between the host device and the storage devices is solely between the host device and the at least one primary agent, and data bandwidth scales with a number of the storage devices.
18. The system of claim 15, wherein the storage device interface network is configured to at least one of:
selectively provide higher bandwidth connections to a subset of the one or more storage devices; and
provide a bandwidth to the host interface network that is related to an aggregate deliverable bandwidth of the one or more storage devices.
19. The system of claim 15, wherein:
the host interface network comprises a Peripheral Component Interconnect Express (PCI-E) Gen4 network;
the storage device interface network comprises one or more of: a PCI-E Gen3 network, an Ethernet network, a Serial Attached Small Computer System Interface (SAS) network, and a Serial Advanced Technology Attachment (SATA) network; and
the one or more storage devices comprise at least one of: a Solid State Disk (SSD), a Hard Disk Drive (HDD), a Magnetoresistive Random Access Memory (MRAM), a tape library, a hybrid magnetic and solid state storage system, and a Redundant Array of Independent Disks (RAID).
20. A non-transitory machine-readable medium, having encoded thereon program code, wherein, when the program code is executed by a machine, the machine implements a method of accessing data in a chained, scalable storage system, the method comprising:
receiving, by a primary agent of one or more storage devices, a host request from a host device coupled to the primary agent via a host interface network, the request to access a logical address of the one or more storage devices;
determining, by the primary agent based on the logical address, a corresponding physical address in at least one of the one or more storage devices;
generating, by the primary agent based on the physical address, a sub-request corresponding to the host request and each of the determined corresponding physical addresses in at least one of the one or more storage devices;
sending, by the primary agent via a storage device interface network operable independently of the host device, the sub-requests to the at least one storage device, the storage device interface network being a peer-to-peer network coupling the storage devices to the primary agent; and
receiving, by the primary agent from the at least one storage device, respective sub-statuses in response to the sub-requests, determining an overall status based on each respective sub-status, and providing the overall status to the host device,
wherein the host device is coupled to the one or more storage devices without employing a network switch.
US13/765,253 2010-06-18 2013-02-12 Chained, scalable storage devices Abandoned US20130159622A1 (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
US13/765,253 US20130159622A1 (en) 2010-06-18 2013-02-12 Chained, scalable storage devices
TW103102356A TWI614670B (en) 2013-02-12 2014-01-22 Chained, scalable storage system and method of accessing data in a chained, scalable storage system
JP2014019112A JP2014154157A (en) 2013-02-12 2014-02-04 Chained, scalable storage devices
EP14153954.4A EP2765501A1 (en) 2013-02-12 2014-02-05 Chained, scalable storage devices
KR1020140014704A KR102171716B1 (en) 2013-02-12 2014-02-10 Chained, scalable storage devices
CN201410047792.8A CN103984638A (en) 2013-02-12 2014-02-11 Chained, scalable storage devices

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US35644310P 2010-06-18 2010-06-18
US201161497525P 2011-06-16 2011-06-16
PCT/US2011/040996 WO2011160094A2 (en) 2010-06-18 2011-06-17 Scalable storage devices
US201213702976A 2012-12-07 2012-12-07
US13/765,253 US20130159622A1 (en) 2010-06-18 2013-02-12 Chained, scalable storage devices

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
PCT/US2011/040996 Continuation WO2011160094A2 (en) 2010-06-18 2011-06-17 Scalable storage devices
US201213702976A Continuation 2010-06-18 2012-12-07

Publications (1)

Publication Number Publication Date
US20130159622A1 true US20130159622A1 (en) 2013-06-20

Family

ID=45348918

Family Applications (3)

Application Number Title Priority Date Filing Date
US13/702,976 Expired - Fee Related US8677068B2 (en) 2010-06-18 2011-06-17 Scalable storage devices
US13/765,253 Abandoned US20130159622A1 (en) 2010-06-18 2013-02-12 Chained, scalable storage devices
US14/197,010 Active US9116624B2 (en) 2010-06-18 2014-03-04 Scalable storage devices

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US13/702,976 Expired - Fee Related US8677068B2 (en) 2010-06-18 2011-06-17 Scalable storage devices

Family Applications After (1)

Application Number Title Priority Date Filing Date
US14/197,010 Active US9116624B2 (en) 2010-06-18 2014-03-04 Scalable storage devices

Country Status (7)

Country Link
US (3) US8677068B2 (en)
EP (1) EP2583184A4 (en)
JP (1) JP5957647B2 (en)
KR (2) KR101491484B1 (en)
CN (2) CN104166441B (en)
TW (1) TWI475379B (en)
WO (1) WO2011160094A2 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170052711A1 (en) * 2015-05-21 2017-02-23 International Business Machines Corporation Data compression for grid-oriented storage systems
US20180107392A1 (en) * 2016-10-19 2018-04-19 Aliane Technologies Co., Memory management system and method thereof
US10248362B2 (en) * 2013-01-17 2019-04-02 Western Digital Technologies, Inc. Data management for a data storage device
US10372361B2 (en) 2014-02-27 2019-08-06 Mitsubishi Electric Corporation Data storage device including multiple memory modules and circuitry to manage communication among the multiple memory modules
US20190347222A1 (en) * 2018-05-11 2019-11-14 Seagate Technology Llc Data storage device with front end bus
US10628364B2 (en) 2017-11-17 2020-04-21 Samsung Electronics Co., Ltd. Dual port storage device performing peer-to-peer communication with external device without intervention of host
US10817214B2 (en) 2018-06-07 2020-10-27 Samsung Electronics Co., Ltd. Storage device set including storage device and reconfigurable logic chip, and storage system including storage device set
US20210011854A1 (en) * 2014-07-02 2021-01-14 Pure Storage, Inc. Distributed storage addressing
US11146636B2 (en) 2020-01-03 2021-10-12 Samsung Electronics Co., Ltd. Method of operating network-based storage device, method of operating storage system using the same and storage module performing the same
US11216194B2 (en) 2016-10-19 2022-01-04 Aliane Technologies Corporation Memory management system and method thereof
WO2024005930A1 (en) * 2022-06-27 2024-01-04 Western Digital Technologies, Inc. Peer raid control among peer data storage devices

Families Citing this family (86)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8671265B2 (en) 2010-03-05 2014-03-11 Solidfire, Inc. Distributed data storage system providing de-duplication of data using block identifiers
US9838269B2 (en) 2011-12-27 2017-12-05 Netapp, Inc. Proportional quality of service based on client usage and system metrics
US9054992B2 (en) 2011-12-27 2015-06-09 Solidfire, Inc. Quality of service policy sets
CN104126181A (en) * 2011-12-30 2014-10-29 英特尔公司 Thin translation for system access of non volatile semicondcutor storage as random access memory
US9875001B2 (en) * 2012-08-26 2018-01-23 Avaya Inc. Network device management and visualization
WO2014077823A2 (en) * 2012-11-15 2014-05-22 Empire Technology Development Llc A scalable storage system having multiple storage channels
US9122515B2 (en) 2012-12-19 2015-09-01 Dell Products L.P. Completion notification for a storage device
CN103902472B (en) * 2012-12-28 2018-04-20 华为技术有限公司 Internal storage access processing method, memory chip and system based on memory chip interconnection
TWI614670B (en) * 2013-02-12 2018-02-11 Lsi公司 Chained, scalable storage system and method of accessing data in a chained, scalable storage system
US9348537B2 (en) * 2013-09-10 2016-05-24 Qualcomm Incorporated Ascertaining command completion in flash memories
JP2016539397A (en) * 2013-11-27 2016-12-15 インテル・コーポレーション Method and apparatus for server platform architecture to enable serviceable non-volatile memory modules
US10067829B2 (en) 2013-12-13 2018-09-04 Intel Corporation Managing redundancy information in a non-volatile memory
US9430599B2 (en) * 2014-02-18 2016-08-30 Optima Design Automation Ltd Determining soft error infliction probability
US20150244795A1 (en) 2014-02-21 2015-08-27 Solidfire, Inc. Data syncing in a distributed system
KR102318478B1 (en) * 2014-04-21 2021-10-27 삼성전자주식회사 Storage controller, storage system and method of operation of the storage controller
KR101512181B1 (en) * 2014-04-23 2015-04-16 주식회사 백프로 SAS Data converting system with internal storage module
US9652415B2 (en) 2014-07-09 2017-05-16 Sandisk Technologies Llc Atomic non-volatile memory data transfer
KR101459750B1 (en) * 2014-07-15 2014-11-13 주식회사 백프로 SAS Data converting system to provide stability
US9904621B2 (en) 2014-07-15 2018-02-27 Sandisk Technologies Llc Methods and systems for flash buffer sizing
US9645744B2 (en) 2014-07-22 2017-05-09 Sandisk Technologies Llc Suspending and resuming non-volatile memory operations
KR102173089B1 (en) 2014-08-08 2020-11-04 삼성전자주식회사 Interface circuit and packet transmission method thereof
JP2016057876A (en) * 2014-09-10 2016-04-21 富士通株式会社 Information processing apparatus, input / output control program, and input / output control method
US9753649B2 (en) 2014-10-27 2017-09-05 Sandisk Technologies Llc Tracking intermix of writes and un-map commands across power cycles
US9558125B2 (en) 2014-10-27 2017-01-31 Sandisk Technologies Llc Processing of un-map commands to enhance performance and endurance of a storage device
US9952978B2 (en) 2014-10-27 2018-04-24 Sandisk Technologies, Llc Method for improving mixed random performance in low queue depth workloads
US9817752B2 (en) 2014-11-21 2017-11-14 Sandisk Technologies Llc Data integrity enhancement to protect against returning old versions of data
US9824007B2 (en) 2014-11-21 2017-11-21 Sandisk Technologies Llc Data integrity enhancement to protect against returning old versions of data
CN104486384A (en) * 2014-11-28 2015-04-01 华为技术有限公司 Storage system and exchange expansion device
US20160202924A1 (en) * 2015-01-13 2016-07-14 Telefonaktiebolaget L M Ericsson (Publ) Diagonal organization of memory blocks in a circular organization of memories
US10912428B2 (en) * 2015-02-12 2021-02-09 Visibelle Derma Institute, Inc. Tip for skin cleansing device
US9647697B2 (en) 2015-03-16 2017-05-09 Sandisk Technologies Llc Method and system for determining soft information offsets
US9652175B2 (en) * 2015-04-09 2017-05-16 Sandisk Technologies Llc Locally generating and storing RAID stripe parity with single relative memory address for storing data segments and parity in multiple non-volatile memory portions
US9753653B2 (en) 2015-04-14 2017-09-05 Sandisk Technologies Llc High-priority NAND operations management
US9864545B2 (en) 2015-04-14 2018-01-09 Sandisk Technologies Llc Open erase block read automation
US10372529B2 (en) 2015-04-20 2019-08-06 Sandisk Technologies Llc Iterative soft information correction and decoding
US9778878B2 (en) 2015-04-22 2017-10-03 Sandisk Technologies Llc Method and system for limiting write command execution
US9870149B2 (en) 2015-07-08 2018-01-16 Sandisk Technologies Llc Scheduling operations in non-volatile memory devices using preference values
US9715939B2 (en) 2015-08-10 2017-07-25 Sandisk Technologies Llc Low read data storage management
US10228990B2 (en) 2015-11-12 2019-03-12 Sandisk Technologies Llc Variable-term error metrics adjustment
US10108377B2 (en) 2015-11-13 2018-10-23 Western Digital Technologies, Inc. Storage processing unit arrays and methods of use
TWI597953B (en) * 2015-11-25 2017-09-01 財團法人工業技術研究院 Pcie network system with failover capability and operation method thereof
US10126970B2 (en) 2015-12-11 2018-11-13 Sandisk Technologies Llc Paired metablocks in non-volatile storage device
US9837146B2 (en) 2016-01-08 2017-12-05 Sandisk Technologies Llc Memory system temperature management
US10732856B2 (en) 2016-03-03 2020-08-04 Sandisk Technologies Llc Erase health metric to rank memory portions
US10929022B2 (en) 2016-04-25 2021-02-23 Netapp. Inc. Space savings reporting for storage system supporting snapshot and clones
WO2017185322A1 (en) * 2016-04-29 2017-11-02 华为技术有限公司 Storage network element discovery method and device
US20170329640A1 (en) * 2016-05-10 2017-11-16 HGST Netherlands B.V. Systems and methods for designating storage processing units as communication hubs and allocating processing tasks in a storage processor array
US10481830B2 (en) 2016-07-25 2019-11-19 Sandisk Technologies Llc Selectively throttling host reads for read disturbs in non-volatile memory system
US10642763B2 (en) 2016-09-20 2020-05-05 Netapp, Inc. Quality of service policy sets
KR102631351B1 (en) * 2016-10-07 2024-01-31 삼성전자주식회사 Storage device capable of performing peer-to-peer communication and data storage system including the same
US10628196B2 (en) * 2016-11-12 2020-04-21 Vmware, Inc. Distributed iSCSI target for distributed hyper-converged storage
CN108614746A (en) * 2016-12-09 2018-10-02 中国移动通信有限公司研究院 A kind of data processing method and its system, server
US10255134B2 (en) 2017-01-20 2019-04-09 Samsung Electronics Co., Ltd. Control plane method and apparatus for providing erasure code protection across multiple storage devices
US10860508B2 (en) * 2017-05-25 2020-12-08 Western Digital Technologies, Inc. Offloaded disaggregated storage architecture
US10430333B2 (en) * 2017-09-29 2019-10-01 Intel Corporation Storage system with interconnected solid state disks
US10608951B2 (en) * 2017-09-30 2020-03-31 Oracle International Corporation Live resegmenting of partitions in distributed stream-processing platforms
US10419265B2 (en) * 2017-11-29 2019-09-17 Bank Of America Corporation Request processing system using a combining engine
KR102446733B1 (en) 2017-11-30 2022-09-23 삼성전자주식회사 Storage devices and electronic devices including storage devices
CN108365926A (en) * 2018-01-17 2018-08-03 北京和利时智能技术有限公司 A kind of novel redundant system
US10740181B2 (en) 2018-03-06 2020-08-11 Western Digital Technologies, Inc. Failed storage device rebuild method
US10860446B2 2018-04-26 2020-12-08 Western Digital Technologies, Inc. Failed storage device rebuild using dynamically selected locations in overprovisioned space
EP3792743A4 (en) 2018-06-30 2021-06-30 Huawei Technologies Co., Ltd. NVME-BASED DATA WRITING PROCESS, DEVICE AND SYSTEM
US10725941B2 (en) 2018-06-30 2020-07-28 Western Digital Technologies, Inc. Multi-device storage system with hosted services on peer storage devices
WO2020000482A1 (en) * 2018-06-30 2020-01-02 华为技术有限公司 Nvme-based data reading method, apparatus and system
US10409511B1 (en) * 2018-06-30 2019-09-10 Western Digital Technologies, Inc. Multi-device storage system with distributed read/write processing
US10824526B2 (en) * 2018-08-03 2020-11-03 Western Digital Technologies, Inc. Using failed storage device in peer-to-peer storage system to perform storage-centric task
US10592144B2 (en) 2018-08-03 2020-03-17 Western Digital Technologies, Inc. Storage system fabric with multichannel compute complex
US10831603B2 (en) * 2018-08-03 2020-11-10 Western Digital Technologies, Inc. Rebuild assist using failed storage device
US10649843B2 (en) * 2018-08-03 2020-05-12 Western Digital Technologies, Inc. Storage systems with peer data scrub
US10901848B2 (en) 2018-08-03 2021-01-26 Western Digital Technologies, Inc. Storage systems with peer data recovery
TWI710909B (en) * 2018-09-28 2020-11-21 普安科技股份有限公司 Storage system architecture with a plurality of data storage subsystems each having a compatible module, and a method thereof
EP3857859B1 (en) 2018-11-16 2023-07-19 VMWare, Inc. Active-active architecture for distributed iscsi target in hyper-converged storage
US11182258B2 (en) 2019-01-04 2021-11-23 Western Digital Technologies, Inc. Data rebuild using dynamic peer work allocation
JP6942163B2 (en) 2019-08-06 2021-09-29 株式会社日立製作所 Drive box, storage system and data transfer method
US11311801B2 (en) * 2019-08-09 2022-04-26 Sony Interactive Entertainment LLC Methods for using high-speed data communication fabric to enable cross-system command buffer reading for data retrieval in cloud gaming
US11500667B2 (en) 2020-01-22 2022-11-15 Vmware, Inc. Object-based approaches to support internet small computer system interface (ISCSI) services in distributed storage system
US11507409B2 (en) 2020-01-22 2022-11-22 Vmware, Inc. Object-based load balancing approaches in distributed storage system
US11316917B2 (en) 2020-03-04 2022-04-26 Samsung Electronics Co., Ltd. Methods and apparatus for peer-to-peer data channels for storage devices
US11734131B2 (en) * 2020-04-09 2023-08-22 Micron Technology, Inc. Memory device having redundant media management capabilities
US11372785B2 (en) * 2020-05-06 2022-06-28 Microsoft Technology Licensing, Llc Local non-volatile memory express virtualization device
US11468169B1 (en) 2021-04-28 2022-10-11 Dell Products L.P. Dark storage support for as-a-service model
US12327018B2 (en) * 2021-10-07 2025-06-10 Samsung Electronics Co., Ltd. Systems, methods, and devices for near storage elasticity
US11734207B1 (en) * 2022-02-02 2023-08-22 Western Digital Technologies, Inc. Dynamic port allocation in PCIe bifurcation system
CN114461557B (en) * 2022-03-17 2025-01-07 山东云海国创云计算装备产业创新中心有限公司 Interface expansion device and method
US11983428B2 (en) * 2022-06-07 2024-05-14 Western Digital Technologies, Inc. Data migration via data storage device peer channel
CN117687889B (en) * 2024-01-31 2024-04-05 苏州元脑智能科技有限公司 Performance test device and method for memory expansion equipment

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6654831B1 (en) * 2000-03-07 2003-11-25 International Business Machine Corporation Using multiple controllers together to create data spans

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6304980B1 (en) 1996-03-13 2001-10-16 International Business Machines Corporation Peer-to-peer backup system with failure-triggered device switching honoring reservation of primary device
US6442709B1 (en) 1999-02-09 2002-08-27 International Business Machines Corporation System and method for simulating disaster situations on peer to peer remote copy machines
US20040103163A1 (en) * 2002-11-27 2004-05-27 Hao-Hsing Lin Serial bus disk extender and portable storage device
WO2004090788A2 (en) * 2003-04-03 2004-10-21 Commvault Systems, Inc. System and method for dynamically performing storage operations in a computer network
US7574529B2 (en) * 2004-06-22 2009-08-11 International Business Machines Corporation Addressing logical subsystems in a data storage system
TWI274353B (en) * 2005-02-21 2007-02-21 Transcend Information Inc Flash disk with expansion capacity, stackable mobile storage device and control circuit thereof
JP4990505B2 (en) * 2005-04-04 2012-08-01 株式会社日立製作所 Storage control device and storage system
WO2007014296A2 (en) * 2005-07-25 2007-02-01 Parascale, Inc. Scalable distributed file storage access and management
CN103744790B (en) * 2005-08-25 2017-05-10 美国莱迪思半导体公司 Smart scalable storage switch architecture
TWI286690B (en) * 2005-08-29 2007-09-11 Via Tech Inc Expanded structure of peripheral storage device having a connector port multiplier
TWI285304B (en) * 2005-10-20 2007-08-11 Quanta Comp Inc Extendable storage apparatus for blade server system
US7685227B2 (en) * 2006-11-10 2010-03-23 Gerber Robert H Message forwarding backup manager in a distributed server system
JP2009175824A (en) * 2008-01-22 2009-08-06 Hitachi Ltd Mainframe storage controller and mainframe volume virtualization method
US8090907B2 (en) * 2008-07-09 2012-01-03 International Business Machines Corporation Method for migration of synchronous remote copy service to a virtualization appliance
US8234470B2 (en) * 2009-08-25 2012-07-31 International Business Machines Corporation Data repository selection within a storage environment

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6654831B1 (en) * 2000-03-07 2003-11-25 International Business Machine Corporation Using multiple controllers together to create data spans

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10248362B2 (en) * 2013-01-17 2019-04-02 Western Digital Technologies, Inc. Data management for a data storage device
US10372361B2 (en) 2014-02-27 2019-08-06 Mitsubishi Electric Corporation Data storage device including multiple memory modules and circuitry to manage communication among the multiple memory modules
US12135654B2 (en) * 2014-07-02 2024-11-05 Pure Storage, Inc. Distributed storage system
US20210011854A1 (en) * 2014-07-02 2021-01-14 Pure Storage, Inc. Distributed storage addressing
US10223000B2 (en) * 2015-05-21 2019-03-05 International Business Machines Corporation Data compression for grid-oriented storage systems
US20170052711A1 (en) * 2015-05-21 2017-02-23 International Business Machines Corporation Data compression for grid-oriented storage systems
US11216194B2 (en) 2016-10-19 2022-01-04 Aliane Technologies Corporation Memory management system and method thereof
US20180107392A1 (en) * 2016-10-19 2018-04-19 Aliane Technologies Co., Memory management system and method thereof
US10628364B2 (en) 2017-11-17 2020-04-21 Samsung Electronics Co., Ltd. Dual port storage device performing peer-to-peer communication with external device without intervention of host
US11816055B2 (en) 2017-11-17 2023-11-14 Samsung Electronics Co., Ltd. Storage device performing peer-to-peer communication with external device without intervention of host
US11055251B2 (en) 2017-11-17 2021-07-06 Samsung Electronics Co., Ltd. Storage device performing peer-to-peer communication with external device without intervention of host
US20190347222A1 (en) * 2018-05-11 2019-11-14 Seagate Technology Llc Data storage device with front end bus
US10929319B2 (en) * 2018-05-11 2021-02-23 Seagate Technology Llc Data storage device with front end bus
US11461043B2 (en) 2018-06-07 2022-10-04 Samsung Electronics Co., Ltd. Storage device set including storage device and reconfigurable logic chip, and storage system including storage device set
US12061818B2 (en) 2018-06-07 2024-08-13 Samsung Electronics Co., Ltd. Storage device set including storage device and reconfigurable logic chip, and storage system including storage device set
US10817214B2 (en) 2018-06-07 2020-10-27 Samsung Electronics Co., Ltd. Storage device set including storage device and reconfigurable logic chip, and storage system including storage device set
US11146636B2 (en) 2020-01-03 2021-10-12 Samsung Electronics Co., Ltd. Method of operating network-based storage device, method of operating storage system using the same and storage module performing the same
US11516292B2 (en) 2020-01-03 2022-11-29 Samsung Electronics Co., Ltd. Method of operating network-based storage device, method of operating storage system using the same and storage module performing the same
US11765229B2 (en) 2020-01-03 2023-09-19 Samsung Electronics Co., Ltd. Method of operating network-based storage device, method of operating storage system using the same and storage module performing the same
WO2024005930A1 (en) * 2022-06-27 2024-01-04 Western Digital Technologies, Inc. Peer raid control among peer data storage devices
US12019917B2 (en) 2022-06-27 2024-06-25 Western Digital Technologies, Inc. Peer RAID control among peer data storage devices

Also Published As

Publication number Publication date
KR20130055632A (en) 2013-05-28
US20140258598A1 (en) 2014-09-11
CN104166441A (en) 2014-11-26
JP5957647B2 (en) 2016-07-27
TW201214105A (en) 2012-04-01
US9116624B2 (en) 2015-08-25
EP2583184A4 (en) 2014-07-09
CN104166441B (en) 2018-05-11
US8677068B2 (en) 2014-03-18
WO2011160094A2 (en) 2011-12-22
KR101491484B1 (en) 2015-02-10
JP2013532339A (en) 2013-08-15
KR101466592B1 (en) 2014-12-01
WO2011160094A3 (en) 2012-04-26
US20130086336A1 (en) 2013-04-04
KR20140107487A (en) 2014-09-04
EP2583184A2 (en) 2013-04-24
TWI475379B (en) 2015-03-01
CN103080917A (en) 2013-05-01
CN103080917B (en) 2014-08-20

Similar Documents

Publication Publication Date Title
US20130159622A1 (en) Chained, scalable storage devices
KR102171716B1 (en) Chained, scalable storage devices
US7525957B2 (en) Input/output router for storage networks
JP7105870B2 (en) Data access method, device and system
US9342413B2 (en) SAS RAID head
CN1971497A (en) Redundant storage virtualization subsystem with SAS interface on the host side and its system
KR20160060119A (en) Method and apparatus for storing data
CN109947376B (en) Multi-protocol interface solid-state storage system based on FPGA
CN103929475B (en) The hard disk storage system and hard disc data operating method of a kind of Ethernet architecture
CN110609659A (en) NVMeoF RAID implementation method for executing read commands
CN105045336A (en) JBOD (Just Bunch of Disks)
JP2014154157A5 (en)
CN206249150U (en) A kind of storage server
CN102843284A (en) iSCSI storage node, framework and read-write method
US9021166B2 (en) Server direct attached storage shared through physical SAS expanders
JP6358483B2 (en) Apparatus and method for routing information in a non-volatile memory-based storage device
US9477414B1 (en) Methods and systems for improved caching with data recovery
KR101564712B1 (en) A system of all flash array storage virtualisation using SCST
CN105700817A (en) Just Bunch of Discs JBOD apparatus
US11784916B2 (en) Intelligent control plane communication
CN202206413U (en) iSCSI storage node, architecture
US9477424B1 (en) Methods and systems for using an intelligent storage adapter for replication in a clustered environment
US8856404B2 (en) Primitive group data encoding in a data storage fabric
WO2010008366A1 (en) System to connect a serial scsi array controller to a storage area network
US10289576B2 (en) Storage system, storage apparatus, and communication method

Legal Events

Date Code Title Description
AS Assignment

Owner name: LSI CORPORATION, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:COHEN, EARL T.;REEL/FRAME:029797/0872

Effective date: 20130212

AS Assignment

Owner name: DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT, NEW YORK

Free format text: PATENT SECURITY AGREEMENT;ASSIGNORS:LSI CORPORATION;AGERE SYSTEMS LLC;REEL/FRAME:032856/0031

Effective date: 20140506


AS Assignment

Owner name: AGERE SYSTEMS LLC, PENNSYLVANIA

Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN CERTAIN PATENTS INCLUDED IN SECURITY INTEREST PREVIOUSLY RECORDED AT REEL/FRAME (032856/0031);ASSIGNOR:DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT;REEL/FRAME:034177/0257

Effective date: 20140902

Owner name: LSI CORPORATION, CALIFORNIA

Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN CERTAIN PATENTS INCLUDED IN SECURITY INTEREST PREVIOUSLY RECORDED AT REEL/FRAME (032856/0031);ASSIGNOR:DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT;REEL/FRAME:034177/0257

Effective date: 20140902

AS Assignment

Owner name: SEAGATE TECHNOLOGY LLC, CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LSI CORPORATION;REEL/FRAME:034770/0859

Effective date: 20140902

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE

AS Assignment

Owner name: LSI CORPORATION, CALIFORNIA

Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENT RIGHTS (RELEASES RF 032856-0031);ASSIGNOR:DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT;REEL/FRAME:037684/0039

Effective date: 20160201

Owner name: AGERE SYSTEMS LLC, PENNSYLVANIA

Free format text: TERMINATION AND RELEASE OF SECURITY INTEREST IN PATENT RIGHTS (RELEASES RF 032856-0031);ASSIGNOR:DEUTSCHE BANK AG NEW YORK BRANCH, AS COLLATERAL AGENT;REEL/FRAME:037684/0039

Effective date: 20160201